zaki work log

Work logs, life logs, and whatnot

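[kubernetes] Adding a master node with kubeadm after the token and certificate have expired

This covers the case where you want to add another master node to a cluster after both of the tokens below, which were used when the cluster was deployed with masters in an HA configuration, have expired.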

[zaki@k8s-master01 ~]$ kubeadm token list
TOKEN                     TTL         EXPIRES                     USAGES                   DESCRIPTION                                                EXTRA GROUPS
1s127v.wbr2b8220imxdk42   1h          2020-04-05T22:37:55+09:00   <none>                   Proxy for managing TTL for the kubeadm-certs secret        <none>
9kqjaz.j0rr9h11wve4tn3o   23h         2020-04-06T20:37:56+09:00   authentication,signing   The default bootstrap token generated by 'kubeadm init'.   system:bootstrappers:kubeadm:default-node-token

This is a follow-up to the article below.

zaki-hmkc.hatenablog.com

For adding a worker node, see this one.

zaki-hmkc.hatenablog.com

Environment

[zaki@k8s-master01 ~]$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.0", GitCommit:"9e991415386e4cf155a24b1da15becaa390438d8", GitTreeState:"clean", BuildDate:"2020-03-25T14:56:30Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}

Recreating the certificate and token

First, the certificate.
As the kubeadm init output on the first master also mentions, run kubeadm init phase upload-certs --upload-certs.

[zaki@k8s-master01 ~]$ kubeadm token list
[zaki@k8s-master01 ~]$ 
[zaki@k8s-master01 ~]$ kubeadm init phase upload-certs --upload-certs
W0406 23:05:32.380415   11085 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
error execution phase upload-certs: failed to load admin kubeconfig: open /etc/kubernetes/admin.conf: permission denied
To see the stack trace of this error execute with --v=5 or higher
[zaki@k8s-master01 ~]$ 
[zaki@k8s-master01 ~]$ 
[zaki@k8s-master01 ~]$ sudo kubeadm init phase upload-certs --upload-certs
W0406 23:05:40.248088   11158 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
d02ee82ed92e18805f81b533a7e3990447eb32ccc5d78914b49e97e7fe557b91
[zaki@k8s-master01 ~]$ 
[zaki@k8s-master01 ~]$ 
[zaki@k8s-master01 ~]$ 
[zaki@k8s-master01 ~]$ kubeadm token list
TOKEN                     TTL         EXPIRES                     USAGES                   DESCRIPTION                                                EXTRA GROUPS
6jvsl4.vj263t6vehh0lo8f   1h          2020-04-07T01:05:40+09:00   <none>                   Proxy for managing TTL for the kubeadm-certs secret        <none>
[zaki@k8s-master01 ~]$ 

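I see. It needed root privileges (the first attempt could not read /etc/kubernetes/admin.conf).
In the output, the part

[upload-certs] Using certificate key:
d02ee82ed92e18805f81b533a7e3990447eb32ccc5d78914b49e97e7fe557b91

is the certificate key needed when adding a master.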
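
If the key needs to be picked up from a script rather than copied by hand, something like the following might do. This is only a sketch: extract_cert_key is a hypothetical helper name, and it assumes the key is printed on a line of its own as 64 lowercase hex digits, as in the output above.

```shell
# Hypothetical helper (not part of kubeadm): read the output of
# `kubeadm init phase upload-certs --upload-certs` on stdin and print
# the last line that consists of exactly 64 lowercase hex digits.
extract_cert_key() {
    grep -E '^[0-9a-f]{64}$' | tail -n 1
}

# Intended use on an existing master (root is required, as seen above):
#   CERT_KEY=$(sudo kubeadm init phase upload-certs --upload-certs | extract_cert_key)
```

Filtering by line shape rather than by line position keeps the helper working even if warning lines end up mixed into the output.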

Next, also create a token for joining nodes.
This is the same command as when adding a worker.

[zaki@k8s-master01 ~]$ kubeadm token create --print-join-command
W0406 23:06:07.627249   11367 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
kubeadm join k8s-master.esxi.jp-z.jp:6443 --token w431jg.vqrtvq0lvkfnkrt3     --discovery-token-ca-cert-hash sha256:87cb5602c5a08f638e000151783837e03c8adf9f0053e77fe948dcae63a16d93 
[zaki@k8s-master01 ~]$ 
[zaki@k8s-master01 ~]$ 
[zaki@k8s-master01 ~]$ kubeadm token list
TOKEN                     TTL         EXPIRES                     USAGES                   DESCRIPTION                                                EXTRA GROUPS
6jvsl4.vj263t6vehh0lo8f   1h          2020-04-07T01:05:40+09:00   <none>                   Proxy for managing TTL for the kubeadm-certs secret        <none>
w431jg.vqrtvq0lvkfnkrt3   23h         2020-04-07T23:06:07+09:00   authentication,signing   <none>                                                     system:bootstrappers:kubeadm:default-node-token
[zaki@k8s-master01 ~]$ 

Now all the pieces are in place.

Adding the master node

Run the following command as root on the master node you want to add.
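
The join command for a master is the worker join command with --control-plane and --certificate-key appended. A minimal sketch of gluing the two outputs together follows; build_control_plane_join is a hypothetical helper name, not something kubeadm provides.

```shell
# Hypothetical helper: turn the worker join command printed by
# `kubeadm token create --print-join-command` into a control-plane join
# command by appending --control-plane and --certificate-key.
build_control_plane_join() {
    join_cmd=$1   # the "kubeadm join ... --token ... --discovery-token-ca-cert-hash ..." line
    cert_key=$2   # the key printed by `kubeadm init phase upload-certs --upload-certs`
    # Squeeze the runs of spaces kubeadm prints and drop trailing spaces.
    join_cmd=$(printf '%s' "$join_cmd" | tr -s ' ' | sed 's/ *$//')
    printf '%s --control-plane --certificate-key %s\n' "$join_cmd" "$cert_key"
}

# Intended use:
#   build_control_plane_join "$(kubeadm token create --print-join-command)" "$CERT_KEY"
```

The helper only prints the command; it still has to be run as root on the new master.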

kubeadm join k8s-master.esxi.jp-z.jp:6443 --token w431jg.vqrtvq0lvkfnkrt3 \
    --discovery-token-ca-cert-hash sha256:87cb5602c5a08f638e000151783837e03c8adf9f0053e77fe948dcae63a16d93 \
    --control-plane --certificate-key d02ee82ed92e18805f81b533a7e3990447eb32ccc5d78914b49e97e7fe557b91

Result

[zaki@k8s-master03 ~]$ sudo kubeadm join k8s-master.esxi.jp-z.jp:6443 --token w431jg.vqrtvq0lvkfnkrt3 \
>     --discovery-token-ca-cert-hash sha256:87cb5602c5a08f638e000151783837e03c8adf9f0053e77fe948dcae63a16d93 \
>     --control-plane --certificate-key d02ee82ed92e18805f81b533a7e3990447eb32ccc5d78914b49e97e7fe557b91
[sudo] zaki のパスワード:
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
        [WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master03.esxi.jp-z.jp localhost] and IPs [192.168.0.123 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master03.esxi.jp-z.jp localhost] and IPs [192.168.0.123 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master03.esxi.jp-z.jp kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local k8s-master.esxi.jp-z.jp] and IPs [10.96.0.1 192.168.0.123]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
W0406 23:08:34.152421    4500 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0406 23:08:34.157405    4500 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0406 23:08:34.158044    4500 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
{"level":"warn","ts":"2020-04-06T23:08:44.107+0900","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"passthrough:///https://192.168.0.123:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node k8s-master03.esxi.jp-z.jp as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8s-master03.esxi.jp-z.jp as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

[zaki@k8s-master03 ~]$ 

Confirming that the node was added

Checking the status from the already-configured (first) master node:

[zaki@k8s-master01 ~]$ kubectl get node 
NAME                        STATUS   ROLES    AGE     VERSION
k8s-master01.esxi.jp-z.jp   Ready    master   12m     v1.18.0
k8s-master02.esxi.jp-z.jp   Ready    master   6m33s   v1.18.0
k8s-master03.esxi.jp-z.jp   Ready    master   26s     v1.18.0
k8s-worker01.esxi.jp-z.jp   Ready    <none>   5m42s   v1.18.0

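It was added properly.

Addendum: deleting the node and recreating it did not work (unresolved)

The AGE column in the result above makes it obvious that the cluster was rebuilt. What actually happened is that, after creating master3 and joining it to the cluster, I wanted to run the join again, so I removed the node with kubectl delete node, deleted the VM and recreated it, and rebuilt it with the same hostname and address. That did not work.

Log from rebuilding master3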
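
To do the same check from a script instead of by eye, the STATUS column can be pulled out of the kubectl get node output. A small sketch; node_status is a hypothetical helper name, and it assumes the default column layout shown above (NAME first, STATUS second).

```shell
# Hypothetical helper: read `kubectl get node` output on stdin and print the
# STATUS column of the node whose NAME equals the first argument.
# Prints nothing if the node is not listed.
node_status() {
    awk -v name="$1" 'NR > 1 && $1 == name { print $2 }'
}

# Intended use:
#   kubectl get node | node_status k8s-master03.esxi.jp-z.jp
```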

[zaki@k8s-master03 ~]$ sudo kubeadm join k8s-master.esxi.jp-z.jp:6443 --token 9tdgi1.d7u993ca6dlxktox \
>     --discovery-token-ca-cert-hash sha256:1cff444b8d55f99f0eef7b0b0208d40db6603a690e06cc30e4ed1ec9de60f3a1 \
>     --control-plane --certificate-key e89f5d59b75d1ff2b5470c7edd9c4f69e31ba1d93003ebd8bef6e9bf2f3b4854
[sudo] zaki のパスワード:
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
        [WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master03.esxi.jp-z.jp localhost] and IPs [192.168.0.123 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master03.esxi.jp-z.jp localhost] and IPs [192.168.0.123 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master03.esxi.jp-z.jp kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local k8s-master.esxi.jp-z.jp] and IPs [10.96.0.1 192.168.0.123]
[certs] Generating "front-proxy-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
W0406 21:59:42.511844   12535 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0406 21:59:42.517133   12535 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0406 21:59:42.517733   12535 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[check-etcd] Checking that the etcd cluster is healthy
error execution phase check-etcd: etcd cluster is not healthy: failed to dial endpoint https://192.168.0.123:2379 with maintenance client: context deadline exceeded
To see the stack trace of this error execute with --v=5 or higher

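The log says it is checking that the etcd cluster is healthy, and then (after quite a long wait) it fails with an error.

context deadline exceeded

What is that? Something expired? (The message is the text of Go's context.DeadlineExceeded error, so it most likely means the connection attempt to that endpoint timed out, not that a token or certificate expired.)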

Creating master4 with a different address in place of master3

I did not understand the cause, so I discarded master3 and its IP address (192.168.0.123) and created a new master4 (address 192.168.0.124).
However:

[zaki@k8s-master04 ~]$ sudo kubeadm join k8s-master.esxi.jp-z.jp:6443 --token 9tdgi1.d7u993ca6dlxktox \
>     --discovery-token-ca-cert-hash sha256:1cff444b8d55f99f0eef7b0b0208d40db6603a690e06cc30e4ed1ec9de60f3a1 \
>     --control-plane --certificate-key e89f5d59b75d1ff2b5470c7edd9c4f69e31ba1d93003ebd8bef6e9bf2f3b4854
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
        [WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master04.esxi.jp-z.jp kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local k8s-master.esxi.jp-z.jp] and IPs [10.96.0.1 192.168.0.124]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master04.esxi.jp-z.jp localhost] and IPs [192.168.0.124 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master04.esxi.jp-z.jp localhost] and IPs [192.168.0.124 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
W0406 22:33:43.576023    4400 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0406 22:33:43.580045    4400 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0406 22:33:43.580631    4400 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[check-etcd] Checking that the etcd cluster is healthy
error execution phase check-etcd: etcd cluster is not healthy: failed to dial endpoint https://192.168.0.123:2379 with maintenance client: context deadline exceeded
To see the stack trace of this error execute with --v=5 or higher

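It is the same error. More precisely, the address it is trying to reach is 192.168.0.123, so it is looking at the settings of the old master3 rather than master4 (192.168.0.124). Some information from the previous node may still be left in the cluster.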
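
One thing that could be checked, as an assumption this post has not verified, is whether the old master3 is still registered as an etcd member: kubectl delete node only removes the Node object and does not touch etcd membership. The sketch below looks up a member by IP in the output of etcdctl member list. find_member_by_ip is a hypothetical helper name, and it assumes etcdctl's default comma-separated output (ID, status, name, peer URL, client URL, learner flag) and the peer port 2380.

```shell
# Hypothetical helper: read `etcdctl member list` output on stdin and print the
# ID of the member whose peer URL is https://<ip>:2380.
# Prints nothing if no member has that peer URL.
find_member_by_ip() {
    ip=$1
    awk -F', ' -v peer="https://$ip:2380" '$4 == peer { print $1 }'
}

# Intended use from a healthy master (pod name and cert paths assume a default
# kubeadm stacked-etcd setup):
#   kubectl -n kube-system exec etcd-k8s-master01.esxi.jp-z.jp -- etcdctl \
#       --endpoints=https://127.0.0.1:2379 \
#       --cacert=/etc/kubernetes/pki/etcd/ca.crt \
#       --cert=/etc/kubernetes/pki/etcd/server.crt \
#       --key=/etc/kubernetes/pki/etcd/server.key \
#       member list | find_member_by_ip 192.168.0.123
```

If an ID comes back for 192.168.0.123, that stale member would be a candidate for etcdctl member remove. Whether removing it actually clears this error was not tested here.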