zaki work log

Work logs, life logs, and whatnot

[kubernetes] Building an HA cluster of master nodes with kubeadm

To make the masters HA with kubeadm, build the cluster by pointing a kubeadm init parameter at an LB (load balancer).

kubernetes.io

Environment (version used): kubeadm 1.18

[zaki@k8s-master01 ~]$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.0", GitCommit:"9e991415386e4cf155a24b1da15becaa390438d8", GitTreeState:"clean", BuildDate:"2020-03-25T14:56:30Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}

Creating the first master node

zaki-hmkc.hatenablog.com

When the master was a single node (using flannel as the CNI), all I had been running was

$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16

and with kubeadm init run like that, the worker nodes access the master (API server) directly on the host where the command was executed.
That means no redundancy: if that single host stops, the whole service goes down, so the goal is a configuration that balances the load through an LB instead.
(Note: this article does not cover the LB setup itself. Build something suitable with HAProxy or the like; a rough sketch follows below.)
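
For illustration, a minimal HAProxy sketch, assuming the LB host answers as k8s-master.esxi.jp-z.jp and forwards TCP 6443 to the masters (the backend entries are examples; adjust names/IPs to your environment):

$ sudo tee -a /etc/haproxy/haproxy.cfg <<'EOF'
# TCP pass-through to the Kubernetes API servers
frontend k8s-api
    bind *:6443
    mode tcp
    default_backend k8s-masters

backend k8s-masters
    mode tcp
    balance roundrobin
    option tcp-check
    server k8s-master01 192.168.0.121:6443 check
    server k8s-master02 192.168.0.122:6443 check
EOF
$ sudo systemctl restart haproxy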

kubeadm init accepts the following additional options.

kubernetes.io

  • --control-plane-endpoint
    • Specify a stable IP address or DNS name for the control plane.
  • --upload-certs
    • Upload control-plane certificates to the kubeadm-certs Secret.

Set --control-plane-endpoint to the LB's address and you're done.

$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --control-plane-endpoint=k8s-master.esxi.jp-z.jp --upload-certs

This builds the first master. Once a CNI is installed and the node is Ready, create the second master.
For reference, building the first master produces output like the following.

[zaki@k8s-master01 ~]$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --control-plane-endpoint=k8s-master.esxi.jp-z.jp --upload-certs
:
:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join k8s-master.esxi.jp-z.jp:6443 --token 9kqjaz.j0rr9h11wve4tn3o \
    --discovery-token-ca-cert-hash sha256:1cff444b8d55f99f0eef7b0b0208d40db6603a690e06cc30e4ed1ec9de60f3a1 \
    --control-plane --certificate-key eeb09b51e430da7111aa633047fe0ee12f50b52225d1cb8a21f1163157f1962d

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join k8s-master.esxi.jp-z.jp:6443 --token 9kqjaz.j0rr9h11wve4tn3o \
    --discovery-token-ca-cert-hash sha256:1cff444b8d55f99f0eef7b0b0208d40db6603a690e06cc30e4ed1ec9de60f3a1 

As shown above, it now prints both a command for adding master nodes and a command for adding worker nodes.
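
To bring the first master to the Ready state, install a CNI. With flannel, it is roughly the following (the manifest URL is the one commonly used around this version; check the flannel repository for the current path):

$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml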

Creating the second and subsequent master nodes

Run kubeadm init on the first master, install the CNI, and once the status is Ready, execute kubeadm join on the second master node being added.
(From a rough test, building the second master first and installing the CNI afterwards also ended up Ready.)

[zaki@k8s-master02 ~]$ sudo kubeadm join k8s-master.esxi.jp-z.jp:6443 --token 9kqjaz.j0rr9h11wve4tn3o \
>     --discovery-token-ca-cert-hash sha256:1cff444b8d55f99f0eef7b0b0208d40db6603a690e06cc30e4ed1ec9de60f3a1 \
>     --control-plane --certificate-key eeb09b51e430da7111aa633047fe0ee12f50b52225d1cb8a21f1163157f1962d
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
        [WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master02.esxi.jp-z.jp kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local k8s-master.esxi.jp-z.jp] and IPs [10.96.0.1 192.168.0.122]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master02.esxi.jp-z.jp localhost] and IPs [192.168.0.122 127.0.0.1 ::1]
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master02.esxi.jp-z.jp localhost] and IPs [192.168.0.122 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
W0405 20:44:35.538462    4390 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0405 20:44:35.542118    4390 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0405 20:44:35.542954    4390 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
{"level":"warn","ts":"2020-04-05T20:44:47.248+0900","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"passthrough:///https://192.168.0.122:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node k8s-master02.esxi.jp-z.jp as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8s-master02.esxi.jp-z.jp as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

[zaki@k8s-master02 ~]$ 

When the process completes, the second master also becomes Ready.

[zaki@k8s-master01 ~]$ kubectl get node
NAME                        STATUS   ROLES    AGE     VERSION
k8s-master01.esxi.jp-z.jp   Ready    master   7m16s   v1.18.0
k8s-master02.esxi.jp-z.jp   Ready    master   28s     v1.18.0

The --experimental-control-plane option

Incidentally, as of now (kubeadm v1.18) the option for joining as an additional control plane is --control-plane.
It used to be --experimental-control-plane, so be careful if you are following older documentation.


Copying the certificates

(This part is guesswork on my side, so I may have it wrong.)
Older versions of kubeadm had no --upload-certs option, so when creating the second and later master nodes you had to copy the certificates over manually from the already-built first master.
(Even now, if you run without the --upload-certs option, you have to distribute the certificates by hand.)

The --upload-certs flag is used to upload the certificates that should be shared across all the control-plane instances to the cluster. If instead, you prefer to copy certs across control-plane nodes manually or using automation tools, please remove this flag and refer to Manual certificate distribution section below.

While copying the certificates by hand I thought "ah, this is a perfect candidate for automating with Ansible", and only after going through the procedure three or four times (writing a playbook along the way) did I find this option. A mild shock.

In outline, the procedure is to take the shared CA and service-account certificates and keys from the already-built first master and distribute them to the same directories on the second and later masters being added (see the sketch below).

In short, it is the "Manual certificate distribution" procedure from the kubeadm HA documentation.
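
A rough sketch of that manual copy, with the file list taken from the kubeadm HA docs (destination host and staging location are examples):

# on the built master01: copy the shared certificates/keys to the master being added
$ sudo scp /etc/kubernetes/pki/{ca.crt,ca.key,sa.key,sa.pub,front-proxy-ca.crt,front-proxy-ca.key} zaki@k8s-master02:
$ sudo scp /etc/kubernetes/pki/etcd/{ca.crt,ca.key} zaki@k8s-master02:
# on master02: place the files under /etc/kubernetes/pki (the etcd CA pair under /etc/kubernetes/pki/etcd) before running kubeadm join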


After two hours have passed

As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use "kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

The output says that "these certificates will self-destruct in two hours" (it doesn't actually put it that way), so if you want to add masters after that point, you need to run the command shown there to re-upload the certificates.
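
Concretely, run something like this on an existing master; it prints a fresh certificate key, which you then pass to --certificate-key on the kubeadm join of the additional master:

$ sudo kubeadm init phase upload-certs --upload-certs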

Details here:

zaki-hmkc.hatenablog.com


kubeadm token list

Within two hours of the initial build, two tokens were listed.
The 1s127v.wbr2b8220imxdk42 one is presumably the token involved in the two-hour behavior mentioned above.

[zaki@k8s-master01 ~]$ kubeadm token list
TOKEN                     TTL         EXPIRES                     USAGES                   DESCRIPTION                                                EXTRA GROUPS
1s127v.wbr2b8220imxdk42   1h          2020-04-05T22:37:55+09:00   <none>                   Proxy for managing TTL for the kubeadm-certs secret        <none>
9kqjaz.j0rr9h11wve4tn3o   23h         2020-04-06T20:37:56+09:00   authentication,signing   The default bootstrap token generated by 'kubeadm init'.   system:bootstrappers:kubeadm:default-node-token
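
Incidentally, once the default bootstrap token above (TTL 23h) has expired, a new worker join command can be generated on an existing master; for adding another master, combine it with a re-uploaded certificate key via --certificate-key as described earlier:

$ sudo kubeadm token create --print-join-command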