zaki work log

Work logs, life logs, and whatnot

Note: a standalone Pod deployed with oc run cannot be evacuated by oc adm drain (work log)

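When a Pod is run standalone with oc run --restart=Never, it is not evacuated even if you take its node out of service with oc adm drain. This is the execution log showing that behavior.
This post supplements the following article.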

zaki-hmkc.hatenablog.com

Environment

[zaki@okd4-manager ~]$ oc version
Client Version: 4.4.0-0.okd-2020-01-28-022517
Server Version: 4.4.0-0.okd-2020-01-28-022517
Kubernetes Version: v1.17.1

Node configuration

[zaki@okd4-manager ~]$ oc get node -o wide
NAME           STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION           CONTAINER-RUNTIME
okd4-master0   Ready    master   10d   v1.17.1   172.16.0.10   <none>        Fedora CoreOS 31.20200127.20.1   5.4.13-201.fc31.x86_64   cri-o://1.17.0-rc1
okd4-master1   Ready    master   10d   v1.17.1   172.16.0.11   <none>        Fedora CoreOS 31.20200127.20.1   5.4.13-201.fc31.x86_64   cri-o://1.17.0-rc1
okd4-master2   Ready    master   10d   v1.17.1   172.16.0.12   <none>        Fedora CoreOS 31.20200127.20.1   5.4.13-201.fc31.x86_64   cri-o://1.17.0-rc1
okd4-worker0   Ready    worker   10d   v1.17.1   172.16.0.20   <none>        Fedora CoreOS 31.20200127.20.1   5.4.13-201.fc31.x86_64   cri-o://1.17.0-rc1
okd4-worker1   Ready    worker   10d   v1.17.1   172.16.0.21   <none>        Fedora CoreOS 31.20200127.20.1   5.4.13-201.fc31.x86_64   cri-o://1.17.0-rc1

Pod status

[zaki@okd4-manager ~]$ oc get pod -o wide
NAME                                   READY   STATUS      RESTARTS   AGE   IP            NODE           NOMINATED NODE   READINESS GATES
dc-run-pod-1-deploy                    0/1     Completed   0          62m   10.129.0.20   okd4-worker0   <none>           <none>
dc-run-pod-1-ht77f                     1/1     Running     0          62m   10.128.2.19   okd4-worker1   <none>           <none>
dc-run-with-svc-1-deploy               0/1     Completed   0          54m   10.129.0.22   okd4-worker0   <none>           <none>
dc-run-with-svc-1-wddbs                1/1     Running     0          53m   10.128.2.20   okd4-worker1   <none>           <none>
deploy-run-pod-56c6698fdb-dzk2g        1/1     Running     0          60m   10.129.0.21   okd4-worker0   <none>           <none>
deploy-run-with-svc-5ffbb8d9c7-h2j28   1/1     Running     0          51m   10.129.0.23   okd4-worker0   <none>           <none>
never-restart-pod                      1/1     Running     0          67m   10.128.2.18   okd4-worker1   <none>           <none>

never-restart-pod is the Pod running standalone (it has no parent resource).
The node it is deployed on (okd4-worker1) is the one to drain.
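Before draining, it can help to check which Pods on the node have no parent resource. The following is a minimal sketch, not something from the original log: the function name list_bare_pods is made up for illustration, and it assumes that a Pod with an empty metadata.ownerReferences is a standalone Pod.

```shell
# Hypothetical helper: print "namespace/name" for every Pod on the given node
# that has no ownerReferences (i.e. no ReplicaSet, ReplicationController, etc.).
list_bare_pods() {
  node="$1"
  # One line per Pod: "<namespace>/<name><TAB><owner kinds, empty if none>"
  oc get pods --all-namespaces --field-selector "spec.nodeName=${node}" \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{"\t"}{.metadata.ownerReferences[*].kind}{"\n"}{end}' \
  | awk -F'\t' '$2 == "" { print $1 }'   # keep only Pods with no owner
}
```

Calling list_bare_pods okd4-worker1 in this environment would be expected to list never-restart-pod, the Pod that the later --force run warns about.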

drain

[zaki@okd4-manager ~]$ oc adm drain okd4-worker1 --ignore-daemonsets --delete-local-data 
node/okd4-worker1 cordoned
evicting pod "alertmanager-main-1"
evicting pod "downloads-cf5d8d7f4-b9bfg"
evicting pod "csi-snapshot-controller-f44f9d4b4-fvwv2"
evicting pod "alertmanager-main-0"
evicting pod "prometheus-adapter-d48bc96cc-g8nwl"
evicting pod "grafana-66d8c446bf-p75bt"
evicting pod "kube-state-metrics-84bf857f94-fpcc8"
evicting pod "telemeter-client-954595b4d-cfptr"
evicting pod "prometheus-k8s-0"
evicting pod "dc-run-pod-1-ht77f"
evicting pod "dc-run-with-svc-1-wddbs"
pod/csi-snapshot-controller-f44f9d4b4-fvwv2 evicted
pod/prometheus-k8s-0 evicted
pod/telemeter-client-954595b4d-cfptr evicted
pod/alertmanager-main-0 evicted
pod/alertmanager-main-1 evicted
pod/dc-run-pod-1-ht77f evicted
pod/downloads-cf5d8d7f4-b9bfg evicted
pod/grafana-66d8c446bf-p75bt evicted
pod/kube-state-metrics-84bf857f94-fpcc8 evicted
pod/prometheus-adapter-d48bc96cc-g8nwl evicted
pod/dc-run-with-svc-1-wddbs evicted
node/okd4-worker1 evicted
[zaki@okd4-manager ~]$ 

Result of the drain

[zaki@okd4-manager ~]$ oc get pod -o wide
NAME                                   READY   STATUS      RESTARTS   AGE   IP            NODE           NOMINATED NODE   READINESS GATES
dc-run-pod-1-deploy                    0/1     Completed   0          65m   10.129.0.20   okd4-worker0   <none>           <none>
dc-run-pod-1-zjslf                     1/1     Running     0          99s   10.129.0.30   okd4-worker0   <none>           <none>
dc-run-with-svc-1-deploy               0/1     Completed   0          56m   10.129.0.22   okd4-worker0   <none>           <none>
dc-run-with-svc-1-tqf44                1/1     Running     0          99s   10.129.0.26   okd4-worker0   <none>           <none>
deploy-run-pod-56c6698fdb-dzk2g        1/1     Running     0          63m   10.129.0.21   okd4-worker0   <none>           <none>
deploy-run-with-svc-5ffbb8d9c7-h2j28   1/1     Running     0          54m   10.129.0.23   okd4-worker0   <none>           <none>
never-restart-pod                      1/1     Running     0          70m   10.128.2.18   okd4-worker1   <none>           <none>

As shown, never-restart-pod is still Running on okd4-worker1: the drain did not evict it.

Adding --force evicts it forcibly, but the Pod is only stopped. Since it has no parent resource, it is not redeployed on another node.

[zaki@okd4-manager ~]$ oc adm uncordon okd4-worker1
node/okd4-worker1 uncordoned
[zaki@okd4-manager ~]$ oc adm drain okd4-worker1 --ignore-daemonsets --delete-local-data --force 
node/okd4-worker1 cordoned
WARNING: deleting Pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet: run-sample/never-restart-pod
evicting pod "alertmanager-main-0"
evicting pod "router-default-5698995cc8-b7k5f"
evicting pod "migrator-5746b7d596-vnrf8"
evicting pod "grafana-66d8c446bf-8ws9l"
evicting pod "prometheus-k8s-0"
evicting pod "never-restart-pod"
pod/alertmanager-main-0 evicted
pod/grafana-66d8c446bf-8ws9l evicted
pod/router-default-5698995cc8-b7k5f evicted
pod/prometheus-k8s-0 evicted
pod/migrator-5746b7d596-vnrf8 evicted
pod/never-restart-pod evicted
node/okd4-worker1 evicted
[zaki@okd4-manager ~]$ oc get pod -o wide
NAME                                   READY   STATUS      RESTARTS   AGE     IP            NODE           NOMINATED NODE   READINESS GATES
dc-run-pod-1-deploy                    0/1     Completed   0          70m     10.129.0.20   okd4-worker0   <none>           <none>
dc-run-pod-1-zjslf                     1/1     Running     0          6m53s   10.129.0.30   okd4-worker0   <none>           <none>
dc-run-with-svc-1-deploy               0/1     Completed   0          61m     10.129.0.22   okd4-worker0   <none>           <none>
dc-run-with-svc-1-tqf44                1/1     Running     0          6m53s   10.129.0.26   okd4-worker0   <none>           <none>
deploy-run-pod-56c6698fdb-dzk2g        1/1     Running     0          68m     10.129.0.21   okd4-worker0   <none>           <none>
deploy-run-with-svc-5ffbb8d9c7-h2j28   1/1     Running     0          59m     10.129.0.23   okd4-worker0   <none>           <none>

If Completed Pods remain on the node, the drain removes them.
Also, depending on the OCP version, this ends in an error.


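I expect plain k8s behaves the same way, but come to think of it, I don't have a multi-node k8s cluster I can play with freely... Maybe I should build one...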


A passing thought: can a Pod be run standalone with a setting other than restartPolicy: Never...?
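One way to try this would be to skip oc run and apply a Pod manifest directly, since the Pod spec itself accepts Always, OnFailure, and Never for restartPolicy. The sketch below was not run in this environment. The function names, the Pod name, and the image are placeholders chosen for illustration; any image that provides a sleep command should do.

```shell
# Hypothetical helper: print a standalone Pod manifest with the given
# restartPolicy (Always, OnFailure or Never). Name and image are placeholders.
bare_pod_manifest() {
  policy="${1:-Always}"
  # Pod names must be lowercase, so derive the suffix from the policy.
  suffix="$(printf '%s' "$policy" | tr 'A-Z' 'a-z')"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: bare-pod-${suffix}
spec:
  restartPolicy: ${policy}
  containers:
  - name: main
    image: registry.access.redhat.com/ubi8/ubi-minimal
    command: ["sleep", "3600"]
EOF
}

# Create the Pod directly, with no Deployment or DeploymentConfig above it.
create_bare_pod() {
  bare_pod_manifest "$1" | oc apply -f -
}
```

For example, create_bare_pod Always would create bare-pod-always. Such a Pod still has no parent resource, so restartPolicy should only affect container restarts on the same node, and the drain behavior shown above would presumably be unchanged.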