From my workstation I can list pods, check component statuses, and so on:
[dude@bionic K8s]$ kubectl get componentstatuses
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-2               Healthy   {"health":"true"}
etcd-0               Healthy   {"health":"true"}
etcd-1               Healthy   {"health":"true"}
[dude@bionic K8s]$ kubectl get svc -A
NAMESPACE     NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
default       kubernetes   ClusterIP   10.32.0.1    <none>        443/TCP                  41d
kube-system   kube-dns     ClusterIP   10.32.0.10   <none>        53/UDP,53/TCP,9153/TCP   49m
[dude@bionic K8s]$ kubectl get pod -n kube-system coredns-574fc576d6-4hbl4
NAME                       READY   STATUS             RESTARTS   AGE
coredns-574fc576d6-4hbl4   0/1     CrashLoopBackOff   13         46m
But when I try to view the logs (to debug the CrashLoop above), I get a certificate error:
[dude@bionic K8s]$ kubectl logs -n kube-system coredns-574fc576d6-4hbl4
plugin/kubernetes: Get "https://10.32.0.1:443/version?timeout=32s": x509: certificate is valid for 172.16.68.221, 172.16.68.222, 172.16.68.223, 172.16.68.69, 127.0.0.1, not 10.32.0.1
Why is this? My setup is a haproxy in front of three control nodes. The haproxy is at 172.16.68.69, which is the address configured in my kubeconfig:
apiVersion: v1
clusters:
- cluster:
    server: https://172.16.68.69:6443
It's as if 'kubectl logs' somehow discovers the ClusterIP and uses it directly, while every other kubectl command uses the properly configured server.
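For reference, you can check which names the apiserver's serving certificate actually covers by dumping its SANs with openssl. In my setup that cert is kubernetes.pem, kept under /var/lib/kubernetes on the controllers; adjust the path if yours lives elsewhere:

# Dump the Subject Alternative Names of the apiserver serving certificate.
# Every address clients connect to (haproxy IP, controller IPs, and, as it
# turned out, the service ClusterIP) needs to appear in this list.
openssl x509 -in /var/lib/kubernetes/kubernetes.pem -noout -text \
  | grep -A1 'Subject Alternative Name'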
I solved it. The problem was that I had not included 10.32.0.1 in the kubernetes.pem cert. I hadn't thought I needed to, but I did: the x509 error above is actually logged by CoreDNS itself, whose kubernetes plugin talks to the apiserver through the service ClusterIP (10.32.0.1). That in-cluster connection doesn't go through the haproxy, only connections to the external address do, so the apiserver's serving cert has to cover the ClusterIP as well.
I recreated kubernetes.pem and kubernetes-key.pem with the ClusterIP added as an extra hostname (SAN), distributed the new certs to /etc/etcd and /var/lib/kubernetes on the controllers, and restarted kube-apiserver. All good now :)
(This fixed the CrashLoop too, since that connection failure was what CoreDNS kept crashing on.)
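For anyone hitting the same thing: my PKI is cfssl-based (Kubernetes the Hard Way style), so treat the file names and the "kubernetes" signing profile below as assumptions for your environment, but the re-issue looked roughly like this:

# Re-issue the apiserver serving cert with the service ClusterIP (10.32.0.1)
# added to the SANs, alongside the controller IPs, the haproxy address and
# localhost. ca.pem / ca-key.pem / ca-config.json / kubernetes-csr.json are
# the files from my existing PKI; add any DNS names your clients also use.
cfssl gencert \
  -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -hostname=10.32.0.1,172.16.68.221,172.16.68.222,172.16.68.223,172.16.68.69,127.0.0.1 \
  -profile=kubernetes \
  kubernetes-csr.json | cfssljson -bare kubernetes

# Copy the new kubernetes.pem / kubernetes-key.pem to /etc/etcd and
# /var/lib/kubernetes on each controller, then restart the apiserver:
sudo systemctl restart kube-apiserver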