Quickly Deploying a Kubernetes 1.7 Cluster on Red Hat 7/CentOS 7 with kubeadm


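Preface

kubeadm is the official Kubernetes tool for quickly bootstrapping a cluster. Its approach is to run the Kubernetes services as containers (Kubernetes static Pods), which simplifies deployment. kubeadm is currently in beta and is not recommended for production use (for example, etcd runs as a single instance). Deploying a cluster with kubeadm is simple and convenient. This article records the full process of deploying a secure Kubernetes cluster with kubeadm on Red Hat 7; the process on CentOS 7 is similar.

To deploy a highly available cluster with kubeadm, the following approach can be used: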
kubeadm init --api-advertise-addresses=vip --external-etcd-endpoints=http://x.x.x.x:2379,http://x.x.x.x:2379,http://x.x.x.x:2379 --pod-network-cidr 10.244.0.0/16

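Note: --api-advertise-addresses itself accepts multiple API server IPs, but kubeadm join then fails when adding nodes, so only a single VIP is configured as the externally advertised address. The flag names in this command are those of an earlier kubeadm release; the kubeadm init example later in this article uses the v1.7 spelling, --apiserver-advertise-address.

Example environment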

etcd Version: 3.0.17
kubeadm: v1.7.1
kubernetes: v1.7.1
Flannel: v0.8.0
Docker: 17.03.1-ce

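Prerequisites

1. Red Hat Enterprise Linux Server release 7.1 (Maipo), with 1 GB+ RAM
2. Network connectivity between all machines in the cluster

Goals

1. Deploy a secure Kubernetes v1.7.1 cluster
2. Deploy a pod network so that pods can communicate with each other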

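Deployment steps

Install Docker
kubeadm has not yet been validated against newer Docker releases such as 1.13 and 17.03+; the Docker versions officially recommended by Kubernetes are 1.10, 1.11, and 1.12. On CentOS, the devicemapper storage driver in direct-lvm mode is recommended. For Docker and direct-lvm, see the official Docker site.
For Docker installation, see the official documentation: https://docs.docker.com/engine ... ntos/
Check the Docker version with docker version: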
sudo docker version
Client:
Version:      17.03.1-ce
API version:  1.27
Go version:   go1.7.5
Git commit:   c6d412e
Built:        Mon Mar 27 17:05:44 2017
OS/Arch:      linux/amd64

Server:
Version:      17.03.1-ce
API version:  1.27 (minimum version 1.12)
Go version:   go1.7.5
Git commit:   c6d412e
Built:        Mon Mar 27 17:05:44 2017
OS/Arch:      linux/amd64
Experimental: false

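For direct-lvm setup, see the official documentation (omitted here): https://docs.docker.com/engine ... iver/
With direct-lvm configured, docker info reports the thin pool (excerpt):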
... ...
Server Version: 17.03.1-ce
Storage Driver: devicemapper
Pool Name: docker-thinpool
Pool Blocksize: 524.3 kB
Base Device Size: 10.74 GB
Backing Filesystem: xfs
Data file:
Metadata file:
Data Space Used: 1.429 GB
Data Space Total: 306 GB
Data Space Available: 304.6 GB
Metadata Space Used: 782.3 kB
Metadata Space Total: 3.217 GB
Metadata Space Available: 3.216 GB
Thin Pool Minimum Free Space: 30.6 GB
Udev Sync Supported: true
Deferred Removal Enabled: true
Deferred Deletion Enabled: false
Deferred Deleted Device Count: 0
Library Version: 1.02.135-RHEL7 (2016-11-16)
Logging Driver: journald
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: host null bridge overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
... ...

Install kubelet and kubeadm
1. Add the yum repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
    https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF

2. Disable SELinux enforcement
sudo setenforce 0
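
Note that setenforce 0 only switches SELinux to permissive mode until the next reboot. The following is a minimal sketch of making that setting persistent, assuming the standard configuration file /etc/selinux/config; the function name selinux_set_permissive is hypothetical and only illustrates the idea.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any tool): rewrite the SELINUX= line of an
# SELinux config file so that permissive mode survives a reboot.
selinux_set_permissive() {
    local cfg="$1" tmp rc
    tmp=$(mktemp) || return 1
    # Only the active "SELINUX=" setting is replaced; comment lines and
    # SELINUXTYPE= do not match the pattern and are left untouched.
    sed 's/^SELINUX=.*/SELINUX=permissive/' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Example (as root), assuming the default config location:
#   selinux_set_permissive /etc/selinux/config
```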

3. Install kubelet and kubeadm with yum
sudo yum install -y kubelet kubeadm

Alternatively, install a specific version as needed. To list the available versions of kubeadm, kubelet, and kubernetes-cni:
yum list kubeadm kubelet kubernetes-cni --showduplicates | sort -r

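4. Edit /etc/systemd/system/kubelet.service.d/10-kubeadm.conf and change the "cgroup-driver" value from systemd to cgroupfs, so that it matches the Cgroup Driver reported by Docker (cgroupfs in the docker info output above):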
KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs
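
The edit in this step can also be scripted. The sketch below reads the driver name from docker info output and rewrites the --cgroup-driver value in the kubelet drop-in file. Both function names are hypothetical, and the sketch assumes the drop-in file already contains a --cgroup-driver=... argument.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the function names are illustrative, not part of kubeadm.

# Print the cgroup driver named in `docker info` output read from stdin.
docker_cgroup_driver() {
    awk -F': *' '/^ *Cgroup Driver:/ { print $2; exit }'
}

# Rewrite the --cgroup-driver value in a kubelet drop-in file ($1) to driver ($2).
set_kubelet_cgroup_driver() {
    local conf="$1" driver="$2" tmp rc
    tmp=$(mktemp) || return 1
    sed "s/--cgroup-driver=[a-z]*/--cgroup-driver=${driver}/" "$conf" > "$tmp" \
        && cat "$tmp" > "$conf"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Example (as root):
#   driver=$(docker info 2>/dev/null | docker_cgroup_driver)
#   set_kubelet_cgroup_driver /etc/systemd/system/kubelet.service.d/10-kubeadm.conf "$driver"
#   systemctl daemon-reload   # systemd must reload unit files after a drop-in changes
```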

5. Enable and start kubelet
sudo systemctl enable kubelet && sudo systemctl start kubelet

6. Check that kubelet started successfully: sudo systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
       └─10-kubeadm.conf
Active: active (running) since Sat 2017-07-15 01:22:15 UTC; 20min ago
 Docs: http://kubernetes.io/docs/
Main PID: 16755 (kubelet)
Memory: 41.6M
CGroup: /system.slice/kubelet.service
       ├─16755 /usr/bin/kubelet --kubeconfig=/etc/kubernetes/kubelet.conf --require-kubeconfig=true --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --ne...
       └─16891 journalctl -k -f

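Note: systemctl start kubelet must be run. At this point kubelet may fail to start with the error below. This is nothing to worry about: after kubeadm init or kubeadm join has run, systemd restarts kubelet automatically and it then comes up successfully.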
error: failed to run Kubelet: invalid kubeconfig: stat /etc/kubernetes/kubelet.conf: no such file or directory

Initialize the cluster

1. Initialize the cluster with sudo kubeadm init. The Kubernetes master IP can be set with --apiserver-advertise-address=<ip-address>. If Flannel is chosen as the pod network, --pod-network-cidr=10.244.0.0/16 must be specified. For example:
sudo kubeadm init --apiserver-advertise-address 192.168.17.139 --pod-network-cidr 10.244.0.0/16

kubeadm init runs a series of pre-flight checks to make sure the prerequisites for cluster deployment are met. The process takes a few minutes.
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.7.1
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks
[preflight] WARNING: docker version is greater than the most recently validated version. Docker version: 17.03.1-ce. Max validated version: 1.12
[preflight] WARNING: docker service is not enabled, please run 'systemctl enable docker.service'
[preflight] Starting the kubelet service
[certificates] Generated CA certificate and key.
[certificates] Generated API server certificate and key.
[certificates] API Server serving cert is signed for DNS names [bjo-ep-dep-039.dev.fwmrm.net kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.17.139]
[certificates] Generated API server kubelet client certificate and key.
[certificates] Generated service account token signing key and public key.
[certificates] Generated front-proxy CA certificate and key.
[certificates] Generated front-proxy client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[apiclient] Created API client, waiting for the control plane to become ready
[apiclient] All control plane components are healthy after 31.001311 seconds
[token] Using token: 472def.6bbb304791b76492
[apiconfig] Created RBAC rules
[addons] Applied essential addon: kube-proxy
[addons] Applied essential addon: kube-dns

Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run (as a regular user):
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
http://kubernetes.io/docs/admin/addons/

You can now join any number of machines by running the following on each node as root:
kubeadm join --token 472def.6bbb304791b76492 192.168.17.139:6443

Kubernetes creates the static pods for the API server, controller manager, scheduler, and so on from the manifests in the /etc/kubernetes/manifests directory.
sudo ls /etc/kubernetes/manifests
etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml

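Taking kube-apiserver.yaml as an example, the file shows the flags used to start kube-apiserver, the image, the health-check probe, QoS-related settings, and so on. The flags can be customized by editing the file: kubelet watches it and recreates the pod as soon as it changes. The images can also be pulled in advance to speed up deployment.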
spec:
  containers:
  - command:
    - kube-apiserver
    - --experimental-bootstrap-token-auth=true
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --requestheader-allowed-names=front-proxy-client
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    - --secure-port=6443
    - --admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeLabel,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota
    - --requestheader-group-headers=X-Remote-Group
    - --allow-privileged=true
    - --requestheader-username-headers=X-Remote-User
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
    - --insecure-port=0
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --service-cluster-ip-range=10.96.0.0/12
    - --authorization-mode=Node,RBAC
    - --advertise-address=192.168.17.139
    - --etcd-servers=http://127.0.0.1:2379
    - --service-node-port-range=20000-65535
    image: gcr.io/google_containers/kube-apiserver-amd64:v1.7.1
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: kube-apiserver
    resources:
      requests:
        cpu: 250m

kubectl get pods --all-namespaces
NAMESPACE     NAME                                                   READY     STATUS    RESTARTS   AGE
kube-system   etcd-bjo-ep-dep-039.dev.fwmrm.net                      1/1       Running   0          2h
kube-system   kube-apiserver-bjo-ep-dep-039.dev.fwmrm.net            1/1       Running   0          2h
kube-system   kube-controller-manager-bjo-ep-dep-039.dev.fwmrm.net   1/1       Running   0          2h
kube-system   kube-dns-2425271678-8k4dn                              0/3       Pending   0          2h
kube-system   kube-proxy-vd39t                                       1/1       Running   0          2h
kube-system   kube-scheduler-bjo-ep-dep-039.dev.fwmrm.net            1/1       Running   0          2h
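
When scanning a listing like the one above, it can help to filter for pods that are not yet Running. The sketch below does this with awk, assuming the six-column layout shown above (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE); pods_not_running is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get pods --all-namespaces` output on stdin
# and print "namespace/name status" for every pod whose STATUS is not Running.
pods_not_running() {
    # NR > 1 skips the header row; NF >= 4 skips blank or short lines.
    awk 'NR > 1 && NF >= 4 && $4 != "Running" { print $1 "/" $2, $4 }'
}

# Example:
#   kubectl get pods --all-namespaces | pods_not_running
```

For the listing above, this prints only the kube-dns pod together with its Pending status.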

In kube-apiserver.yaml, the --insecure-port option is set to 0, which means kube-apiserver does not listen on the default HTTP port 8080 and instead listens on HTTPS port 6443:
sudo netstat -nltp | grep 6443
tcp        0      0 0.0.0.0:6443            0.0.0.0:*               LISTEN      20936/kube-apiserve 

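Note: the output above shows that the kube-dns pod has been created but is in the Pending state. The main reason is that the Flannel pod network has not been deployed yet. In addition, the Master Isolation feature described below leaves kube-dns with no node to be scheduled on. Either joining a node or lifting Master Isolation lets kube-dns run and reach the Running state.

2. Record the command printed at the end of kubeadm init, "kubeadm join --token 472def.6bbb304791b76492 192.168.17.139:6443"; it is needed to join nodes to the cluster. The token can also be retrieved with sudo kubeadm token list.
sudo kubeadm token list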
TOKEN                     TTL         EXPIRES   USAGES                   DESCRIPTION
472def.6bbb304791b76492   <forever>   <never>   authentication,signing   The default bootstrap token generated by 'kubeadm init'.
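
A bootstrap token has the form of a 6-character ID and a 16-character secret, both lowercase alphanumeric, joined by a dot. Before pasting a recorded token into kubeadm join on another machine, its shape can be checked with a small sketch like the following; is_bootstrap_token is a hypothetical helper, and it checks only the format, not whether the token is valid on the master.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if $1 looks like a kubeadm bootstrap token,
# i.e. a 6-character ID and a 16-character secret of [a-z0-9], joined by a dot.
is_bootstrap_token() {
    [[ "$1" =~ ^[a-z0-9]{6}\.[a-z0-9]{16}$ ]]
}

# Example:
#   is_bootstrap_token 472def.6bbb304791b76492 && echo "token format ok"
```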

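Note: if the cluster was previously initialized with 'sudo kubeadm init', or the node was previously joined with 'sudo kubeadm join', the pre-flight checks fail. Run sudo kubeadm reset to revert the earlier state: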
[preflight] Running pre-flight checks
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Removing kubernetes-managed containers
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /etc/cni/net.d /var/lib/dockershim /var/lib/etcd]
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]

3. After kubeadm init finishes, run the following commands (note: do not run mkdir with sudo, and leave $HOME as written; it does not need to be substituted):
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Master Isolation

For security reasons, pods are not scheduled onto the master node by default. This restriction can be lifted by removing the master taint with the following command (note the trailing "-"):
kubectl taint nodes --all node-role.kubernetes.io/master-
node "bjo-ep-dep-039.dev.fwmrm.net" untainted

Install the Flannel pod network

1. Install Flannel with the following two commands:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel-rbac.yml
clusterrole "flannel" created
clusterrolebinding "flannel" created

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
serviceaccount "flannel" created
configmap "kube-flannel-cfg" created
daemonset "kube-flannel-ds" created

2. Verify that DNS works
Create busybox.yaml with the following content:
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - image: busybox
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
    name: busybox
  restartPolicy: Always

Create the pod with kubectl create -f busybox.yaml, then verify that DNS works with kubectl exec -ti busybox -- nslookup kubernetes.default:
kubectl exec -ti busybox -- nslookup kubernetes.default
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes.default
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local

Join nodes to the cluster

1. Join the node using the token generated by kubeadm init earlier: sudo kubeadm join 192.168.17.139:6443 --token 472def.6bbb304791b76492
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[preflight] Running pre-flight checks
[preflight] WARNING: hostname "" could not be reached
[preflight] WARNING: hostname "" lookup : no such host
[preflight] Some fatal errors occurred:
hostname "" a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
[preflight] If you know what you are doing, you can skip pre-flight checks with `--skip-preflight-checks`

The pre-flight checks report a fatal error. This appears to be a bug in kubeadm 1.7; the checks can be skipped with --skip-preflight-checks:
sudo kubeadm join --skip-preflight-checks --token 472def.6bbb304791b76492 192.168.17.139:6443
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[preflight] Skipping pre-flight checks
[discovery] Trying to connect to API Server "192.168.17.139:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.17.139:6443"
[discovery] Cluster info signature and contents are valid, will use API Server "https://192.168.17.139:6443"
[discovery] Successfully established connection with API Server "192.168.17.139:6443"
[bootstrap] Detected server version: v1.7.1
[bootstrap] The server supports the Certificates API (certificates.k8s.io/v1beta1)
[csr] Created API client to obtain unique certificate for this node, generating keys and certificate signing request
[csr] Received signed certificate from the API server, generating KubeConfig...
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"

Node join complete:
* Certificate signing request sent to master and response
received.
* Kubelet informed of new secure connection details.

Run 'kubectl get nodes' on the master to see this machine join.

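Note: kubeadm and kubelet must be installed on the node, and kubelet must be started, as described earlier. Otherwise join reports success but the node does not actually join the cluster.

2. Check with kubectl get nodes that the nodes joined the cluster successfully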
kubectl get nodes
NAME                           STATUS    AGE       VERSION
bjo-ep-dep-039.dev.fwmrm.net   Ready     1d        v1.7.1
bjo-ep-dep-040.dev.fwmrm.net   Ready     19m       v1.7.1
bjo-ep-svc-017.dev.fwmrm.net   Ready     1h        v1.7.1 
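
The same check can be scripted, for example to wait until every node has joined before continuing. This is a minimal sketch that assumes the column layout shown above (NAME, STATUS, AGE, VERSION); all_nodes_ready is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get nodes` output on stdin and exit 0
# only if at least one node is listed and every STATUS is exactly "Ready".
all_nodes_ready() {
    awk 'NR > 1 && NF >= 2 { n++; if ($2 != "Ready") bad++ }
         END { exit ((n > 0 && bad == 0) ? 0 : 1) }'
}

# Example:
#   kubectl get nodes | all_nodes_ready && echo "all nodes Ready"
```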

3. Test that the cluster works
kubectl create -f https://raw.githubusercontent.com/kubernetes/kubernetes.github.io/master/docs/concepts/workloads/controllers/nginx-deployment.yaml

kubectl get po -o wide shows which nodes the three Nginx replicas were scheduled onto (in the output below they run on two of the nodes):
kubectl get po -o wide
NAME                               READY     STATUS    RESTARTS   AGE       IP            NODE
nginx-deployment-431080787-2z167   1/1       Running   0          3m        10.244.0.15   bjo-ep-dep-039.dev.fwmrm.net
nginx-deployment-431080787-55fl8   1/1       Running   0          3m        10.16.103.5   bjo-ep-svc-017.dev.fwmrm.net
nginx-deployment-431080787-bcmfx   1/1       Running   0          3m        10.16.103.4   bjo-ep-svc-017.dev.fwmrm.net
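
To see the placement at a glance, the pods can be counted per node. The sketch below assumes the seven-column -o wide layout shown above, with NODE as the last column; pods_per_node is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get po -o wide` output on stdin and
# print "<count> <node>" for each node, sorted by node name.
pods_per_node() {
    awk 'NR > 1 && NF >= 7 { count[$7]++ }
         END { for (node in count) print count[node], node }' | sort -k2
}

# Example:
#   kubectl get po -o wide | pods_per_node
```

For the listing above, this reports one pod on bjo-ep-dep-039.dev.fwmrm.net and two on bjo-ep-svc-017.dev.fwmrm.net.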

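Accessing the cluster from outside

For security reasons, the cluster cannot be operated directly from outside by default. To operate it from outside the cluster, copy the admin kubeconfig and pass it to kubectl: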
scp root@<master ip>:/etc/kubernetes/admin.conf .
kubectl --kubeconfig ./admin.conf get nodes

Example:
xiazhang-mac:~ xiazhang$ kubectl --kubeconfig ./admin.conf -n kube-system  get nodes
NAME                           STATUS    AGE       VERSION
bjo-ep-dep-039.dev.fwmrm.net   Ready     2d        v1.7.1
bjo-ep-dep-040.dev.fwmrm.net   Ready     22h       v1.7.1
bjo-ep-svc-017.dev.fwmrm.net   Ready     23h       v1.7.1

Alternatively, set up a proxy with kubectl proxy:
scp root@<master ip>:/etc/kubernetes/admin.conf .
kubectl --kubeconfig ./admin.conf proxy

Example:
xiazhang-mac:~ xiazhang$ kubectl --kubeconfig ./admin.conf proxy
Starting to serve on 127.0.0.1:8001
kubectl config set-cluster default-cluster --server=http://localhost:8001

Open another terminal and run kubectl get nodes:
xiazhang-mac:~ xiazhang$ kubectl get nodes
NAME                           STATUS    AGE       VERSION
bjo-ep-dep-039.dev.fwmrm.net   Ready     2d        v1.7.1
bjo-ep-dep-040.dev.fwmrm.net   Ready     22h       v1.7.1
bjo-ep-svc-017.dev.fwmrm.net   Ready     23h       v1.7.1

Kubernetes Dashboard

1. Take the official yml file from https://github.com/kubernetes/dashboard and add a NodePort to expose the service port. This article uses 31000 as an example.
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 9090
    nodePort: 31000
  selector:
    k8s-app: kubernetes-dashboard

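2. Starting with Kubernetes 1.6, the API server enables RBAC authorization. kubernetes-dashboard.yaml does not define a ServiceAccount with the required permissions, so requests made through the Dashboard at http://<NodeIP>:<NodePort> are rejected: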
User "system:serviceaccount:kube-system:default" cannot list statefulsets.apps in the namespace "default". (get statefulsets.apps)

Define dashboard-rbac.yaml as follows, then run kubectl create -f dashboard-rbac.yaml:
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: dashboard-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: default
  namespace: kube-system

3. Deploy Heapster in the cluster
First download the following file and directory:
https://github.com/kubernetes/heapster/blob/master/deploy/kube-config/rbac/heapster-rbac.yaml
https://github.com/kubernetes/heapster/tree/master/deploy/kube-config/influxdb

Copy heapster-rbac.yaml into the influxdb folder, then run:
kubectl create -f deploy/kube-config/influxdb

Note: the Grafana deployment may fail with the following error:
Starting a utility program that will configure Grafana
Starting Grafana in foreground mode
t=2017-07-17T07:28:47+0000 lvl=crit msg="Failed to parse /etc/grafana/grafana.ini, open /etc/grafana/grafana.ini: no such file or directory%!(EXTRA []interface {}=[])"

This can be resolved by switching the image. Edit deploy/kube-config/influxdb/grafana.yaml:
spec:
  containers:
  - name: grafana
    #image: gcr.io/google_containers/heapster-grafana-amd64:v4.2.0
    image: gcr.io/google_containers/heapster-grafana-amd64:v4.0.2

4. Access the Dashboard UI: http://192.168.17.139:31000/%2 ... D_all

The cluster deployment is now complete. Questions are welcome in the comments.

Reposting is welcome; please credit the author and source: Zhang Xia, FreeWheel Lead Engineer, DockerOne community
