I'm having trouble adding a CNI to a Kubernetes master node: the CNI plugin does not have access to certain files or folders. The logs from both Calico and Flannel report that certain files or folders are not accessible (in this post I only refer to Calico).
I found the same problem with kubectl, kubeadm, and kubelet in versions v1.19.4 and v1.19.3. Docker is on version 19.03.13-ce, using overlay2 on an ext4 filesystem and systemd as the cgroup driver. Swap is disabled.
The only thing going in that direction that I found on Stack Overflow is this: Kubernetes Cluster with Calico - Containers are not coming up & failing with FailedCreatePodSandBox
In the first step I set up the cluster with kubeadm (using Calico's default pod CIDR):
# kubeadm init --apiserver-advertise-address=192.168.178.33 --pod-network-cidr=192.168.0.0/16
This is working correctly; the kubelet logs contain a message that a CNI is required. After this I apply the Calico CNI:
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
After waiting some time, the master node remains in the following state:
❯ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5c6f6b67db-zdksz 0/1 ContainerCreating 0 7m47s
kube-system calico-node-sc42z 0/1 CrashLoopBackOff 5 7m47s
kube-system coredns-f9fd979d6-4zrcj 0/1 ContainerCreating 0 8m11s
kube-system coredns-f9fd979d6-wf9r2 0/1 ContainerCreating 0 8m11s
kube-system etcd-hs-0 1/1 Running 0 8m20s
kube-system kube-apiserver-hs-0 1/1 Running 0 8m20s
kube-system kube-controller-manager-hs-0 1/1 Running 0 8m20s
kube-system kube-proxy-t6ngd 1/1 Running 0 8m11s
kube-system kube-scheduler-hs-0 1/1 Running 0 8m20s
To me, the information I got from the following command:
kubectl describe pods calico-node-sc42z --namespace kube-system
is inconsistent: the calico-node pod has the volume mounted, yet at the same time it has no access to it (look at the Volumes section and the Events below).
❯ kubectl describe pods calico-node-sc42z --namespace kube-system
Name: calico-node-sc42z
Namespace: kube-system
Priority: 2000001000
Priority Class Name: system-node-critical
Node: hs-0/192.168.178.48
Start Time: Sat, 14 Nov 2020 00:58:36 +0100
Labels: controller-revision-hash=5f678767
k8s-app=calico-node
pod-template-generation=1
Annotations: <none>
Status: Running
IP: 192.168.178.48
IPs:
IP: 192.168.178.48
Controlled By: DaemonSet/calico-node
Init Containers:
upgrade-ipam:
Container ID: docker://29c6cf8b73ecb98ee18169db0f6ffe8b141a8a6e10b2c839fc5bf05177f066ac
Image: calico/cni:v3.16.5
Image ID: docker-pullable://calico/cni@sha256:e05d0ee834c2004e8e7c4ee165a620166cd16e3cb8204a06eb52e5300b46650b
Port: <none>
Host Port: <none>
Command:
/opt/cni/bin/calico-ipam
-upgrade
State: Terminated
Reason: Completed
Exit Code: 0
Started: Sat, 14 Nov 2020 00:58:48 +0100
Finished: Sat, 14 Nov 2020 00:58:48 +0100
Ready: True
Restart Count: 0
Environment Variables from:
kubernetes-services-endpoint ConfigMap Optional: true
Environment:
KUBERNETES_NODE_NAME: (v1:spec.nodeName)
CALICO_NETWORKING_BACKEND: <set to the key 'calico_backend' of config map 'calico-config'> Optional: false
Mounts:
/host/opt/cni/bin from cni-bin-dir (rw)
/var/lib/cni/networks from host-local-net-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-tzhr4 (ro)
install-cni:
Container ID: docker://4435863e0d2f3ab4535aa6ca49ff95d889e71614861f3c7c0e4213d8c333f4db
Image: calico/cni:v3.16.5
Image ID: docker-pullable://calico/cni@sha256:e05d0ee834c2004e8e7c4ee165a620166cd16e3cb8204a06eb52e5300b46650b
Port: <none>
Host Port: <none>
Command:
/opt/cni/bin/install
State: Terminated
Reason: Completed
Exit Code: 0
Started: Sat, 14 Nov 2020 00:58:49 +0100
Finished: Sat, 14 Nov 2020 00:58:49 +0100
Ready: True
Restart Count: 0
Environment Variables from:
kubernetes-services-endpoint ConfigMap Optional: true
Environment:
CNI_CONF_NAME: 10-calico.conflist
CNI_NETWORK_CONFIG: <set to the key 'cni_network_config' of config map 'calico-config'> Optional: false
KUBERNETES_NODE_NAME: (v1:spec.nodeName)
CNI_MTU: <set to the key 'veth_mtu' of config map 'calico-config'> Optional: false
SLEEP: false
Mounts:
/host/etc/cni/net.d from cni-net-dir (rw)
/host/opt/cni/bin from cni-bin-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-tzhr4 (ro)
flexvol-driver:
Container ID: docker://ca03f59013c1576a4a605a6d737af78ec3e859376aa11a301e56f0ffdacbc8db
Image: calico/pod2daemon-flexvol:v3.16.5
Image ID: docker-pullable://calico/pod2daemon-flexvol@sha256:7b20fd9cc36c7196dd24d56cc1e89ac573c634856ee020334b0b30cf5b8a3d3b
Port: <none>
Host Port: <none>
State: Terminated
Reason: Completed
Exit Code: 0
Started: Sat, 14 Nov 2020 00:58:56 +0100
Finished: Sat, 14 Nov 2020 00:58:56 +0100
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/host/driver from flexvol-driver-host (rw)
/var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-tzhr4 (ro)
Containers:
calico-node:
Container ID: docker://96bbc7f4adf1d5cb9a927aedc18e16da7b5ed4b0ff1290179a8dd4a51c115ab8
Image: calico/node:v3.16.5
Image ID: docker-pullable://calico/node@sha256:43c145b2bd837611d8d41e70631a8f2cc2b97b5ca9d895d66ffddd414dab83c5
Port: <none>
Host Port: <none>
State: Running
Started: Sat, 14 Nov 2020 01:04:51 +0100
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Sat, 14 Nov 2020 01:03:41 +0100
Finished: Sat, 14 Nov 2020 01:04:51 +0100
Ready: False
Restart Count: 5
Requests:
cpu: 250m
Liveness: exec [/bin/calico-node -felix-live -bird-live] delay=10s timeout=1s period=10s #success=1 #failure=6
Readiness: exec [/bin/calico-node -felix-ready -bird-ready] delay=0s timeout=1s period=10s #success=1 #failure=3
Environment Variables from:
kubernetes-services-endpoint ConfigMap Optional: true
Environment:
DATASTORE_TYPE: kubernetes
WAIT_FOR_DATASTORE: true
NODENAME: (v1:spec.nodeName)
CALICO_NETWORKING_BACKEND: <set to the key 'calico_backend' of config map 'calico-config'> Optional: false
CLUSTER_TYPE: k8s,bgp
IP: autodetect
CALICO_IPV4POOL_IPIP: Always
CALICO_IPV4POOL_VXLAN: Never
FELIX_IPINIPMTU: <set to the key 'veth_mtu' of config map 'calico-config'> Optional: false
FELIX_VXLANMTU: <set to the key 'veth_mtu' of config map 'calico-config'> Optional: false
FELIX_WIREGUARDMTU: <set to the key 'veth_mtu' of config map 'calico-config'> Optional: false
CALICO_DISABLE_FILE_LOGGING: true
FELIX_DEFAULTENDPOINTTOHOSTACTION: ACCEPT
FELIX_IPV6SUPPORT: false
FELIX_LOGSEVERITYSCREEN: info
FELIX_HEALTHENABLED: true
Mounts:
/lib/modules from lib-modules (ro)
/run/xtables.lock from xtables-lock (rw)
/sys/fs/ from sysfs (rw)
/var/lib/calico from var-lib-calico (rw)
/var/run/calico from var-run-calico (rw)
/var/run/nodeagent from policysync (rw)
/var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-tzhr4 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
lib-modules:
Type: HostPath (bare host directory volume)
Path: /lib/modules
HostPathType:
var-run-calico:
Type: HostPath (bare host directory volume)
Path: /var/run/calico
HostPathType:
var-lib-calico:
Type: HostPath (bare host directory volume)
Path: /var/lib/calico
HostPathType:
xtables-lock:
Type: HostPath (bare host directory volume)
Path: /run/xtables.lock
HostPathType: FileOrCreate
sysfs:
Type: HostPath (bare host directory volume)
Path: /sys/fs/
HostPathType: DirectoryOrCreate
cni-bin-dir:
Type: HostPath (bare host directory volume)
Path: /opt/cni/bin
HostPathType:
cni-net-dir:
Type: HostPath (bare host directory volume)
Path: /etc/cni/net.d
HostPathType:
host-local-net-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/cni/networks
HostPathType:
policysync:
Type: HostPath (bare host directory volume)
Path: /var/run/nodeagent
HostPathType: DirectoryOrCreate
flexvol-driver-host:
Type: HostPath (bare host directory volume)
Path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds
HostPathType: DirectoryOrCreate
calico-node-token-tzhr4:
Type: Secret (a volume populated by a Secret)
SecretName: calico-node-token-tzhr4
Optional: false
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: :NoScheduleop=Exists
:NoExecuteop=Exists
CriticalAddonsOnly op=Exists
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 6m52s default-scheduler Successfully assigned kube-system/calico-node-sc42z to hs-0
Normal Pulling 6m51s kubelet Pulling image "calico/cni:v3.16.5"
Normal Pulled 6m40s kubelet Successfully pulled image "calico/cni:v3.16.5" in 10.618669742s
Normal Started 6m40s kubelet Started container upgrade-ipam
Normal Created 6m40s kubelet Created container upgrade-ipam
Normal Created 6m39s kubelet Created container install-cni
Normal Pulled 6m39s kubelet Container image "calico/cni:v3.16.5" already present on machine
Normal Started 6m39s kubelet Started container install-cni
Normal Pulling 6m38s kubelet Pulling image "calico/pod2daemon-flexvol:v3.16.5"
Normal Started 6m32s kubelet Started container flexvol-driver
Normal Created 6m32s kubelet Created container flexvol-driver
Normal Pulled 6m32s kubelet Successfully pulled image "calico/pod2daemon-flexvol:v3.16.5" in 6.076268177s
Normal Pulling 6m31s kubelet Pulling image "calico/node:v3.16.5"
Normal Pulled 6m19s kubelet Successfully pulled image "calico/node:v3.16.5" in 12.051211859s
Normal Created 6m19s kubelet Created container calico-node
Normal Started 6m19s kubelet Started container calico-node
Warning Unhealthy 5m32s (x5 over 6m12s) kubelet Readiness probe failed: calico/node is not ready: BIRD is not ready: Failed to stat() nodename file: stat /var/lib/calico/nodename: no such file or directory
Warning Unhealthy 109s (x23 over 6m9s) kubelet Liveness probe failed: calico/node is not ready: bird/confd is not live: exit status 1
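The readiness error says BIRD cannot stat /var/lib/calico/nodename. One way to narrow this down is to check on the host whether calico-node ever managed to write that file into the hostPath volume listed in the Volumes section above. A minimal sketch (the `check_nodename` helper is only for illustration, not part of Calico):

```shell
#!/usr/bin/env bash
# Check whether calico-node has written its nodename file into the
# hostPath directory that backs /var/lib/calico inside the container.
check_nodename() {
  local dir="${1:-/var/lib/calico}"   # hostPath from the pod spec above
  if [ -f "$dir/nodename" ]; then
    echo "nodename file present: $(cat "$dir/nodename")"
  else
    echo "nodename file missing in $dir"
  fi
}

# Run on the master node:
check_nodename /var/lib/calico
```

If the file is missing, the readiness failure is a symptom: calico-node never got far enough in startup to write it, which points back at the datastore errors in the logs below.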
Further, I have the logs of calico-node, but I do not understand how to benefit from this additional information. Unfortunately, I don't know whether "datastore" refers to the file system (meaning this is the error I already know about) or whether it is something additional.
❯ kubectl logs calico-node-sc42z -n kube-system -f
2020-11-14 01:42:55.536 [INFO][8] startup/startup.go 376: Early log level set to info
2020-11-14 01:42:55.536 [INFO][8] startup/startup.go 392: Using NODENAME environment for node name
2020-11-14 01:42:55.536 [INFO][8] startup/startup.go 404: Determined node name: hs-0
2020-11-14 01:42:55.539 [INFO][8] startup/startup.go 436: Checking datastore connection
2020-11-14 01:43:25.539 [INFO][8] startup/startup.go 451: Hit error connecting to datastore - retry error=Get "https://10.96.0.1:443/api/v1/nodes/foo": dial tcp 10.96.0.1:443: i/o timeout
2020-11-14 01:43:56.540 [INFO][8] startup/startup.go 451: Hit error connecting to datastore - retry error=Get "https://10.96.0.1:443/api/v1/nodes/foo": dial tcp 10.96.0.1:443: i/o timeout
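The "Hit error connecting to datastore" lines show that calico-node cannot reach the API server through the cluster service IP 10.96.0.1:443, so this looks like a networking problem rather than a file-access one. A rough way to reproduce that timeout from the node itself, using only bash's /dev/tcp redirection (the `probe` function is illustrative, not a Calico tool):

```shell
#!/usr/bin/env bash
# Probe a TCP endpoint the way calico-node's datastore check effectively
# does, printing "reachable" or "unreachable" instead of a Go i/o timeout.
probe() {
  local host="$1" port="$2"
  if timeout 2 bash -c "cat < /dev/null > /dev/tcp/$host/$port" 2>/dev/null; then
    echo "reachable"
  else
    echo "unreachable"
  fi
}

# Run on the master node; 10.96.0.1 is the kubernetes service ClusterIP:
probe 10.96.0.1 443
```

If this prints "unreachable", the kube-proxy service routing to the API server is broken on the node, which matches the dial timeout in the logs.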
Maybe someone can give me a hint on how to solve this problem, or where to read about this topic. Greetings, Kokos Bot.
It might be that Calico's default pod CIDR conflicts with your host CIDR. I got that impression from your
--apiserver-advertise-address=192.168.178.33
. If that is the case, it is worth trying a different pod CIDR, e.g. --pod-network-cidr=20.96.0.0/12,
with kubeadm init.
For a clean installation, better do a
kubeadm reset
before the above changes. Please be aware of the impacts of the kubeadm reset
command before executing it (read here). Reference - https://stackoverflow.com/questions/60742165/kubernetes-calico-replicaset