Kubernetes master cannot find the minion


The log shows the following errors:
shell
Nov 25 09:05:34 fedora201 kube-scheduler: E1125 09:05:34.514772 00496 factory.go:179] Error scheduling apache: failed to find a fit for pod: {{ } {apache default /api/v1beta1/pods/apache 7 2014-11-24 18:27:56 +0800 CST map map[]} {{v1beta1 apache 8e6f8482-73c4-11e4-9c8f-000c2951bd4d [] [{master fedora/apache [] [] 0 0 [] <nil> <nil> false }] {0x9e97d0 <nil> <nil>}} Running map[]} {{ [] [] {<nil> <nil> <nil>}} Waiting map[]} map[]}; retrying
Nov 25 09:05:36 fedora201 kube-scheduler: E1125 09:05:36.319853 00496 event.go:51] Sleeping: Unable to write event: events "" already exists


Service status:
shell

netstat -tulnp | grep -E "(kube)|(docker)|(cadvisor)"

tcp 0 0 127.0.0.1:10251 0.0.0.0:* LISTEN 496/kube-scheduler
tcp 0 0 127.0.0.1:10252 0.0.0.0:* LISTEN 499/kube-controller
tcp6 0 0 :::7080 :::* LISTEN 500/kube-apiserver
tcp6 0 0 :::8080 :::* LISTEN 500/kube-apiserver

But the minion cannot be found:
shell

kubectl get minions

NAME

杨章显 - Systems Operations Engineer, Cisco

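First, check whether kubelet is listening on port 10250 on your minion server. If it is, check whether you can telnet from the master to port 10250 on the minion. If that fails, check whether the minion's firewall is blocking port 10250.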
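
The reachability check described above can be scripted instead of typed into telnet by hand. This is only a sketch: `check_port` is a hypothetical helper name, and it assumes bash and the coreutils `timeout` command are installed on the machine you run it from.

```shell
# Hypothetical helper: report whether a TCP port accepts connections.
# Assumes bash and coreutils `timeout` are available.
check_port() {
    host="$1"
    port="$2"
    # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT;
    # timeout gives up after 3 seconds if packets are silently dropped
    if timeout 3 bash -c 'exec 3<>/dev/tcp/"$1"/"$2"' _ "$host" "$port" 2>/dev/null; then
        echo "open: $host:$port"
    else
        echo "unreachable: $host:$port"
        return 1
    fi
}
```

Run from the master, `check_port <minion-ip> 10250` should print `open: ...` when kubelet is listening and no firewall is in the way. A quick refusal usually means nothing is listening, while a 3-second wait before `unreachable` more likely points to a firewall dropping packets.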

jasko

Thank you for replying to my question.

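I just checked, and port 10250 had gone down (I remember it was still open when I checked yesterday).
I restarted the service, confirmed the port is open, and tried telnet, which succeeds.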
shell

netstat -tulnp | grep -E "(kube)|(etcd)"

tcp 0 0 172.21.0.123:10250 0.0.0.0:* LISTEN 6935/kubelet

telnet 172.21.0.123 10250

Trying 172.21.0.123...
Connected to 172.21.0.123.
Escape character is '^]'.
123
HTTP/1.1 400 Bad Request
Connection closed by foreign host.

But the minion still does not show up:
shell

kubectl get minions

NAME

The log is as follows:
shell
Nov 25 11:48:58 fedora201 kube-scheduler: E1125 11:48:58.132515 00496 reflector.go:106] watch of *api.Pod ended with error: unexpected end of JSON input
Nov 25 11:48:59 fedora201 kube-scheduler: E1125 11:48:59.216259 00496 event.go:51] Sleeping: Unable to write event: events "" already exists

I looked at line 106 of reflector.go, which reads:
shell
glog.Errorf("watch of %v ended with error: %v", r.expectedType, err)

Still looking for help.

杨章显 - Systems Operations Engineer, Cisco

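Which Kubernetes version are you running? If it is v0.4.x, you should set --machines=x.x.x.x on the apiserver, where x.x.x.x is the minion's IP address. If it is v0.5.x, you should set --machines=x.x.x.x on the controller-manager instead, where x.x.x.x is again the minion's IP address. Can you check whether you have configured this?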
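
One way to check this quickly is to pull the `--machines` value out of the service's config file. The sketch below assumes a sysconfig-style file with a line such as `KUBELET_ADDRESSES="--machines=..."`; `machines_in` is a hypothetical helper name, and the file path you pass to it depends on your installation.

```shell
# Hypothetical helper: print the value of --machines from a config file.
# Prints nothing if the flag does not appear in the file.
machines_in() {
    # keep only the text after --machines=, stopping at a quote or a space
    sed -n 's/.*--machines=\([^" ]*\).*/\1/p' "$1"
}
```

For example, `machines_in /etc/kubernetes/apiserver` (path assumed) would print the comma-separated minion list. Empty output suggests the flag is missing from the file that matters for your version.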

jasko

The version is 0.4+-0.9.git8e1d416.fc20.
fedora201 is the master; its IP is 172.21.0.122.
fedora202 is the minion; its IP is 172.21.0.123.

I checked, and the configuration files do contain the relevant settings.

The apiserver configuration is as follows:

shell

cat apiserver

#
# kubernetes system config
#
# The following values are used to configure the kubernetes-apiserver
#

# The address on the local server to listen to.
KUBE_API_ADDRESS="--address=0.0.0.0"

# The port on the local server to listen on.
KUBE_API_PORT="--port=8080"

# How the replication controller and scheduler find the apiserver
KUBE_MASTER="--master=172.21.0.122:8080"

# Comma seperated list of minions
KUBELET_ADDRESSES="--machines=172.21.0.123"

# Port minions listen on
KUBELET_PORT="--kubelet_port=10250"

# Address range to use for services
KUBE_SERVICE_ADDRESSES="--portal_net=10.254.0.0/16"

# Add you own!
KUBE_API_ARGS=""

The controller-manager configuration is as follows:

shell

cat controller-manager

#
# The following values are used to configure the kubernetes controller-manager

# defaults from config and apiserver should be adequate

# Comma seperated list of minions
KUBELET_ADDRESSES="--machines=172.21.0.123"

# Add you own!
KUBE_CONTROLLER_MANAGER_ARGS=""

杨章显 - Systems Operations Engineer, Cisco

Is etcd running?

jasko

Yes, it is running:

netstat -tulnp | grep -E "(kube)|(etcd)"

tcp 0 0 127.0.0.1:10251 0.0.0.0:* LISTEN 500/kube-scheduler
tcp 0 0 127.0.0.1:10252 0.0.0.0:* LISTEN 502/kube-controller
tcp6 0 0 :::7080 :::* LISTEN 503/kube-apiserver
tcp6 0 0 :::8080 :::* LISTEN 503/kube-apiserver
tcp6 0 0 :::7001 :::* LISTEN 600/etcd
tcp6 0 0 :::4001 :::* LISTEN 600/etcd

杨章显 - Systems Operations Engineer, Cisco

Can the minion reach port 4001?

jasko

Yes, it can:

telnet 172.21.0.122 4001

Trying 172.21.0.122...
Connected to 172.21.0.122.
Escape character is '^]'.
123
HTTP/1.1 400 Bad Request

Connection closed by foreign host.

While restarting the services just now, I noticed this:
Nov 25 14:31:35 fedora201 systemd: Started Etcd Server.
Nov 25 14:31:35 fedora201 etcd: Using the directory fedora201.etcd as the etcd curation directory because a directory was not specified.
Nov 25 14:31:35 fedora201 etcd: Peer URLs in log: / fedora201 (http:/ /127.0.0.1:7001)
Nov 25 14:31:35 fedora201 etcd: etcd server ****
Nov 25 14:31:35 fedora201 etcd: peer server ****
Nov 25 14:31:35 fedora201 etcd: fedora201 starting in peer mode
Nov 25 14:31:35 fedora201 etcd: fedora201: state changed from 'initialized' to 'follower'.
and also this:
Nov 25 14:31:35 fedora201 systemd: Started Kubernetes Scheduler Plugin.
Nov 25 14:31:35 fedora201 kube-scheduler: E1125 14:31:35.811700 01367 event.go:102] Could not construct reference to: &api.Pod{TypeMeta:api.TypeMeta{Ki...6355242167
Nov 25 14:31:35 fedora201 kube-scheduler: E1125 14:31:35.814115 01367 factory.go:179] Error scheduling apache: failed to find a fit for pod: {{ } {apache default...

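Here the etcd server and peer server addresses are both 127.0.0.1. Is this configuration normal? Could it affect access from the minion?
(My user group is not allowed to post external links, so I put a space between the two slashes.)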

杨章显 - Systems Operations Engineer, Cisco

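That configuration will not work. 127.0.0.1 is the loopback address, so how could other machines find it? You need to configure etcd with the IP address its service actually runs on.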
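
As a rough illustration of what that might look like, the sketch below builds an etcd command line that advertises the master's real IP instead of the loopback address. It assumes etcd 0.4-style flags (`-name`, `-addr`, `-peer-addr`), so please confirm them with `etcd -help` on your build. The IP and name come from this thread, and the variable names are only illustrative.

```shell
# Values taken from the thread: fedora201 is the master at 172.21.0.122
ETCD_NAME="fedora201"
ETCD_IP="172.21.0.122"

# Assumed etcd 0.4-style flags (verify with `etcd -help`):
#   -addr       address advertised to clients (port 4001 in the thread)
#   -peer-addr  address advertised to other etcd peers (port 7001)
ETCD_CMD="etcd -name $ETCD_NAME -addr $ETCD_IP:4001 -peer-addr $ETCD_IP:7001"

# Print the command for review instead of running it
echo "$ETCD_CMD"
```

If etcd is started by systemd, the same addresses would go into whatever config file the unit reads rather than onto a hand-typed command line. Any Kubernetes component that currently reaches etcd through 127.0.0.1 would presumably need to point at the new address as well.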

jasko

Thanks, I will go and study etcd.
