This covers the case where you want to add another master node to a cluster that was originally deployed with masters in an HA configuration, after both of the tokens shown below (created at deploy time) have expired.
[zaki@k8s-master01 ~]$ kubeadm token list
TOKEN                     TTL   EXPIRES                     USAGES                   DESCRIPTION                                                 EXTRA GROUPS
1s127v.wbr2b8220imxdk42   1h    2020-04-05T22:37:55+09:00   <none>                   Proxy for managing TTL for the kubeadm-certs secret         <none>
9kqjaz.j0rr9h11wve4tn3o   23h   2020-04-06T20:37:56+09:00   authentication,signing   The default bootstrap token generated by 'kubeadm init'.   system:bootstrappers:kubeadm:default-node-token
This is positioned as a follow-up to the article below.
For the worker-node case, see this article.
Environment
[zaki@k8s-master01 ~]$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.0", GitCommit:"9e991415386e4cf155a24b1da15becaa390438d8", GitTreeState:"clean", BuildDate:"2020-03-25T14:56:30Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}
Re-creating the certificates and the token
First, the certificates. As the output of kubeadm init on the first master also mentions, run kubeadm init phase upload-certs --upload-certs.
[zaki@k8s-master01 ~]$ kubeadm token list
[zaki@k8s-master01 ~]$
[zaki@k8s-master01 ~]$ kubeadm init phase upload-certs --upload-certs
W0406 23:05:32.380415   11085 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
error execution phase upload-certs: failed to load admin kubeconfig: open /etc/kubernetes/admin.conf: permission denied
To see the stack trace of this error execute with --v=5 or higher
[zaki@k8s-master01 ~]$
[zaki@k8s-master01 ~]$
[zaki@k8s-master01 ~]$ sudo kubeadm init phase upload-certs --upload-certs
W0406 23:05:40.248088   11158 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key: d02ee82ed92e18805f81b533a7e3990447eb32ccc5d78914b49e97e7fe557b91
[zaki@k8s-master01 ~]$
[zaki@k8s-master01 ~]$
[zaki@k8s-master01 ~]$
[zaki@k8s-master01 ~]$ kubeadm token list
TOKEN                     TTL   EXPIRES                     USAGES   DESCRIPTION                                           EXTRA GROUPS
6jvsl4.vj263t6vehh0lo8f   1h    2020-04-07T01:05:40+09:00   <none>   Proxy for managing TTL for the kubeadm-certs secret   <none>
[zaki@k8s-master01 ~]$
I see: root privileges were required.
The following line in the output:
[upload-certs] Using certificate key: d02ee82ed92e18805f81b533a7e3990447eb32ccc5d78914b49e97e7fe557b91
is the certificate key required when adding a new master.
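If you want to capture that key in a script instead of copying it by hand, something like the following should work. This is just a sketch that relies on the key being printed at the end of the command's stdout (as kubeadm v1.18 does); the awk step takes the last field so it works whether or not the key shares a line with the "[upload-certs]" prefix.

# Re-upload the control-plane certificates and keep only the generated key.
CERT_KEY=$(sudo kubeadm init phase upload-certs --upload-certs 2>/dev/null | tail -1 | awk '{print $NF}')
echo "${CERT_KEY}"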
Next, also create the token for joining the node. This is the same command used when adding a worker.
[zaki@k8s-master01 ~]$ kubeadm token create --print-join-command
W0406 23:06:07.627249   11367 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
kubeadm join k8s-master.esxi.jp-z.jp:6443 --token w431jg.vqrtvq0lvkfnkrt3     --discovery-token-ca-cert-hash sha256:87cb5602c5a08f638e000151783837e03c8adf9f0053e77fe948dcae63a16d93
[zaki@k8s-master01 ~]$
[zaki@k8s-master01 ~]$
[zaki@k8s-master01 ~]$ kubeadm token list
TOKEN                     TTL   EXPIRES                     USAGES                   DESCRIPTION                                           EXTRA GROUPS
6jvsl4.vj263t6vehh0lo8f   1h    2020-04-07T01:05:40+09:00   <none>                   Proxy for managing TTL for the kubeadm-certs secret   <none>
w431jg.vqrtvq0lvkfnkrt3   23h   2020-04-07T23:06:07+09:00   authentication,signing   <none>                                                system:bootstrappers:kubeadm:default-node-token
[zaki@k8s-master01 ~]$
Now all the pieces are in place.
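As an aside, the two steps above can be combined into one small helper run as root on an existing master. This is only a sketch under the same assumptions as before (the certificate key is the last field of the last stdout line, and the commands are run with root privileges); it prints a ready-to-use control-plane join command.

#!/bin/bash
# Print a control-plane join command by combining a fresh certificate key
# with the join command generated by kubeadm. Run as root on a master node.
set -eu

# Re-upload the certificates and grab the certificate key from the last stdout line.
cert_key=$(kubeadm init phase upload-certs --upload-certs 2>/dev/null | tail -1 | awk '{print $NF}')

# Create a new bootstrap token together with the base (worker) join command.
join_cmd=$(kubeadm token create --print-join-command 2>/dev/null)

# Append the control-plane options to turn it into a master join command.
echo "${join_cmd} --control-plane --certificate-key ${cert_key}"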
Adding the master node
Run the following command on the master node you want to add.
kubeadm join k8s-master.esxi.jp-z.jp:6443 --token w431jg.vqrtvq0lvkfnkrt3 \
    --discovery-token-ca-cert-hash sha256:87cb5602c5a08f638e000151783837e03c8adf9f0053e77fe948dcae63a16d93 \
    --control-plane --certificate-key d02ee82ed92e18805f81b533a7e3990447eb32ccc5d78914b49e97e7fe557b91
Execution result:
[zaki@k8s-master03 ~]$ sudo kubeadm join k8s-master.esxi.jp-z.jp:6443 --token w431jg.vqrtvq0lvkfnkrt3 \
>     --discovery-token-ca-cert-hash sha256:87cb5602c5a08f638e000151783837e03c8adf9f0053e77fe948dcae63a16d93 \
>     --control-plane --certificate-key d02ee82ed92e18805f81b533a7e3990447eb32ccc5d78914b49e97e7fe557b91
[sudo] password for zaki:
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
        [WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master03.esxi.jp-z.jp localhost] and IPs [192.168.0.123 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master03.esxi.jp-z.jp localhost] and IPs [192.168.0.123 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master03.esxi.jp-z.jp kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local k8s-master.esxi.jp-z.jp] and IPs [10.96.0.1 192.168.0.123]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
W0406 23:08:34.152421    4500 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0406 23:08:34.157405    4500 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0406 23:08:34.158044    4500 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
{"level":"warn","ts":"2020-04-06T23:08:44.107+0900","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"passthrough:///https://192.168.0.123:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node k8s-master03.esxi.jp-z.jp as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8s-master03.esxi.jp-z.jp as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

[zaki@k8s-master03 ~]$
Confirming that the node has been added
Checking the state from the already-configured (first) master node:
[zaki@k8s-master01 ~]$ kubectl get node
NAME                        STATUS   ROLES    AGE     VERSION
k8s-master01.esxi.jp-z.jp   Ready    master   12m     v1.18.0
k8s-master02.esxi.jp-z.jp   Ready    master   6m33s   v1.18.0
k8s-master03.esxi.jp-z.jp   Ready    master   26s     v1.18.0
k8s-worker01.esxi.jp-z.jp   Ready    <none>   5m42s   v1.18.0
The node has been added as expected.
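As an extra check (not strictly required), you can also confirm that the static pods for the new control plane, including its etcd member, are running. The grep below simply assumes the hostnames used in this cluster:

# The new node should show its own kube-apiserver, kube-controller-manager,
# kube-scheduler and etcd static pods in Running state.
kubectl -n kube-system get pods -o wide | grep k8s-master03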
Note: deleting the node and recreating it did not work (unresolved)
The AGE column in the output above makes it obvious that I rebuilt things. In fact, after creating master3 and adding it to the cluster, I wanted to rerun the procedure, so I removed the node with kubectl delete node, deleted the VM and created it again, and rebuilt it with the same hostname and address; that did not work.
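For reference, the removal step on the existing master was essentially the command below. Note that deleting the Node object alone does not remove that node's etcd member; running kubeadm reset on the node before destroying the VM is what would normally take care of that, and would likely have avoided the problem described next (a hedged aside, not something I verified here).

# Remove the Node object from the cluster (run on an existing master).
kubectl delete node k8s-master03.esxi.jp-z.jp

# Not done in this case: running this on master3 before destroying the VM.
# kubeadm reset also removes the node's etcd member and cleans up /etc/kubernetes.
sudo kubeadm reset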
Log from rebuilding master3:
[zaki@k8s-master03 ~]$ sudo kubeadm join k8s-master.esxi.jp-z.jp:6443 --token 9tdgi1.d7u993ca6dlxktox \
>     --discovery-token-ca-cert-hash sha256:1cff444b8d55f99f0eef7b0b0208d40db6603a690e06cc30e4ed1ec9de60f3a1 \
>     --control-plane --certificate-key e89f5d59b75d1ff2b5470c7edd9c4f69e31ba1d93003ebd8bef6e9bf2f3b4854
[sudo] password for zaki:
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
        [WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master03.esxi.jp-z.jp localhost] and IPs [192.168.0.123 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master03.esxi.jp-z.jp localhost] and IPs [192.168.0.123 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master03.esxi.jp-z.jp kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local k8s-master.esxi.jp-z.jp] and IPs [10.96.0.1 192.168.0.123]
[certs] Generating "front-proxy-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
W0406 21:59:42.511844   12535 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0406 21:59:42.517133   12535 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0406 21:59:42.517733   12535 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[check-etcd] Checking that the etcd cluster is healthy
error execution phase check-etcd: etcd cluster is not healthy: failed to dial endpoint https://192.168.0.123:2379 with maintenance client: context deadline exceeded
To see the stack trace of this error execute with --v=5 or higher
Even though the [check-etcd] line about the etcd cluster being healthy was printed, the command ended with an error after taking quite a long time.
What is context deadline exceeded? Some sort of expiration? (It is essentially a timeout: the request to the etcd endpoint got no response within the deadline.)
Creating master4 with a different address in place of master3
Since I couldn't figure it out, I discarded master3 together with its IP address (192.168.0.123) and created a new master4 (address 192.168.0.124).
But no luck:
[zaki@k8s-master04 ~]$ sudo kubeadm join k8s-master.esxi.jp-z.jp:6443 --token 9tdgi1.d7u993ca6dlxktox \
>     --discovery-token-ca-cert-hash sha256:1cff444b8d55f99f0eef7b0b0208d40db6603a690e06cc30e4ed1ec9de60f3a1 \
>     --control-plane --certificate-key e89f5d59b75d1ff2b5470c7edd9c4f69e31ba1d93003ebd8bef6e9bf2f3b4854
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
        [WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master04.esxi.jp-z.jp kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local k8s-master.esxi.jp-z.jp] and IPs [10.96.0.1 192.168.0.124]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master04.esxi.jp-z.jp localhost] and IPs [192.168.0.124 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master04.esxi.jp-z.jp localhost] and IPs [192.168.0.124 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
W0406 22:33:43.576023    4400 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0406 22:33:43.580045    4400 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0406 22:33:43.580631    4400 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[check-etcd] Checking that the etcd cluster is healthy
error execution phase check-etcd: etcd cluster is not healthy: failed to dial endpoint https://192.168.0.123:2379 with maintenance client: context deadline exceeded
To see the stack trace of this error execute with --v=5 or higher
Same error. Or rather, the address it is trying to reach is 192.168.0.123, so it is looking at the old master3's configuration instead of master4 (192.168.0.124). Some information from the previous node apparently remains, most likely a stale etcd member entry for 192.168.0.123 that was never removed from the etcd cluster when the node was deleted; a way to check this is sketched below.
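I have not verified this on this cluster, but if the old master3 etcd member really is still registered, it should show up in the member list. A sketch of how to inspect it from the first master, assuming the default kubeadm etcd pod name (etcd-k8s-master01.esxi.jp-z.jp) and the default certificate paths under /etc/kubernetes/pki/etcd:

# List the etcd members as seen from the etcd pod running on master01.
kubectl -n kube-system exec etcd-k8s-master01.esxi.jp-z.jp -- etcdctl \
    --endpoints https://127.0.0.1:2379 \
    --cacert /etc/kubernetes/pki/etcd/ca.crt \
    --cert /etc/kubernetes/pki/etcd/server.crt \
    --key /etc/kubernetes/pki/etcd/server.key \
    member list

# If a member for the old 192.168.0.123 node is still listed, removing it with
# 'etcdctl ... member remove <MEMBER_ID>' should let the check-etcd phase pass again.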