Summary
I am attempting to bootstrap a Kubernetes cluster on AWS using Kubeadm. Before you suggest them: I am not interested in using EKS or another bootstrapping solution such as Kops, Kubespray, etc.
It appears that there is a lot of inaccurate information about the proper procedure out there, largely because Cloud Provider integrations are being moved out of tree instead of being managed in-tree. As a result, I've been struggling to get a clear picture in my head of how to properly set this integration up.
The Requirements
The official repo indicates three requirements.
1) You must initialize kubelet, kube-apiserver, and kube-controller-manager with the --cloud-provider=external argument. If I understand things correctly, this tells them to use the out-of-tree provider; using aws here instead would select the in-tree provider, which is on a deprecation timeline.
2) You must create two IAM policies, associate them with IAM Instance Profiles, and launch your Kubernetes nodes with the appropriate profile attached.
3) Each node in the cluster must have its hostname set to the Private DNS name of the underlying EC2 instance (see the sketch just after this list).
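As a rough sketch of how to satisfy requirement 3 (assuming the default IMDSv1 metadata endpoint is reachable from the instance), the hostname can be aligned with the Private DNS name like this:
$> # local-hostname returns the instance's Private DNS name
$> curl -s http://169.254.169.254/latest/meta-data/local-hostname
ip-10-0-10-91.us-gov-west-1.compute.internal
$> sudo hostnamectl set-hostname "$(curl -s http://169.254.169.254/latest/meta-data/local-hostname)"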
In addition to this, I believe it was once required to attach the following Tags to your EC2 instances, Route Tables, Security Groups, and Subnets, which I have done for good measure as well:
"kubernetes.io/cluster/${var.K8S_CLUSTER_NAME}" = "kubernetes.io/cluster/${var.K8S_CLUSTER_NAME}"
The Problem
Despite all of this, when my worker nodes come online after bootstrapping, they have the following taint applied:
node.cloudprovider.kubernetes.io/uninitialized: true
This obviously implies that the nodes have not been initialized by the Cloud Provider. I'm not really sure where to go from here. There is an open request for additional instructions on how to use the Cloud Provider integration with AWS, but it is currently unsatisfied.
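The taint can be confirmed with something like the following (the node name comes from my environment; output should look roughly like this):
$> kubectl describe node ip-10-0-10-91.us-gov-west-1.compute.internal | grep Taints
Taints:             node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule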
My Configuration
You might have noticed that I left a comment on that issue detailing my problem as well. Here is a summary of my environment showing that I should be in compliance with the listed requirements.
1) My Kubeadm config files set the cloud provider to external in four places:
KubeletConfiguration and InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    cloud-provider: external
ClusterConfiguration
apiServer:
  extraArgs:
    cloud-provider: external
ClusterConfiguration
controllerManager:
  extraArgs:
    cloud-provider: external
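As a sanity check that kubeadm actually rendered these settings (the paths below are the kubeadm defaults; the static pod manifests only exist on the control-plane node), the flags can be inspected on the nodes themselves:
$> grep cloud-provider /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS="... --cloud-provider=external ..."
$> grep cloud-provider /etc/kubernetes/manifests/kube-apiserver.yaml /etc/kubernetes/manifests/kube-controller-manager.yaml
/etc/kubernetes/manifests/kube-apiserver.yaml:    - --cloud-provider=external
/etc/kubernetes/manifests/kube-controller-manager.yaml:    - --cloud-provider=external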
2) My EC2 instances were launched with an instance profile that has the IAM policies outlined in the README attached:
$> aws ec2 describe-instances --instance-ids INSTANCE.ID | jq '.Reservations[].Instances[].IamInstanceProfile[]'
"arn:aws-us-gov:iam::ACCOUNT.ID:instance-profile/PROFILE-NAME"
3) The hostnames are the EC2 Private DNS names:
$> hostname -f
ip-10-0-10-91.us-gov-west-1.compute.internal
4) The EC2 instances, as well as my route tables, subnets, etc., are tagged with:
"kubernetes.io/cluster/${var.K8S_CLUSTER_NAME}" = "kubernetes.io/cluster/${var.K8S_CLUSTER_NAME}"
As a result, it looks like I am in compliance with all of the requirements, so I am unsure why my nodes are still left with that Taint. Any help would be greatly appreciated!
EDIT
I have updated the tags on each instance to:
"kubernetes.io/cluster/${var.K8S_CLUSTER_NAME}" = "owned"
And added this tag to each Subnet:
"kubernetes.io/role/internal-elb" = 1
This has not resolved the situation, however.
EDIT 2
A user elsewhere suggested that the issue may be that I didn't apply the RBAC and DaemonSet resources present in the manifests directory of the cloud-provider-aws repo. After doing so using this image, I can confirm that this has NOT resolved my issue, since the aws-cloud-controller-manager appears to expect you to be using aws, not external, as per the logs produced by the pod on startup:
Generated self-signed cert in-memory
Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
Version: v0.0.0-master+$Format:%h$
WARNING: aws built-in cloud provider is now deprecated. The AWS provider is deprecated and will be removed in a future release
Building AWS cloudprovider
Zone not specified in configuration file; querying AWS metadata service
Cloud provider could not be initialized: could not init cloud provider "aws": clusterID tags did not match: "example-14150" vs "True"
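For reference, applying those resources and pulling the controller's logs looked roughly like this (assuming a local clone of the repo, that the DaemonSet keeps the name aws-cloud-controller-manager from the manifests, and that its image field is swapped for the custom build):
$> git clone https://github.com/kubernetes/cloud-provider-aws.git
$> kubectl apply -f cloud-provider-aws/manifests/
$> kubectl -n kube-system logs daemonset/aws-cloud-controller-manager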
EDIT 3
I built a new image using the repo as of commit 6a14c81. It can be found here. It appears to also be using the aws provider by default?
Cloud provider could not be initialized: could not init cloud provider "aws": clusterID tags did not match: "example-14150" vs "True"
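If I understand the legacy tagging code correctly, the clusterID is derived both from the suffix of the kubernetes.io/cluster/... tag key and from the value of the legacy KubernetesCluster tag, so a mismatch like this can usually be tracked down by dumping both sets of tags on each instance (placeholder ID again):
$> aws ec2 describe-tags --filters "Name=resource-id,Values=INSTANCE.ID" | jq '.Tags[] | select((.Key | startswith("kubernetes.io/cluster/")) or .Key == "KubernetesCluster")'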
The Solution
The documentation does not mention that you are required to deploy the AWS Cloud Controller Manager along with its required RBAC policies. These can be found in /manifests in the repo.
There is not currently a published AWS Cloud Controller Manager image, so you will need to build and host it yourself, or use my image built from the newest commit, found here.
You will notice that --cloud-provider=aws is passed as an argument. Despite this being the EXTERNAL cloud provider integration, it IS in fact necessary to pass aws, not external, here.
Lastly, all of your instances must also be tagged with:
"KubernetesCluster" = var.K8S_CLUSTER_NAME