I have set up an AWS VPC and am trying to deploy a functional container in ECS on the Fargate launch type, but the task always fails with:
STOPPED (CannotPullContainerError: Error response from daem)
Task role context: ecsTaskExecutionRole, which has the following IAM permissions:
The repo permissions are as follows:
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "AllowPull",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::aws_account_id:role/ecsTaskExecutionRole"
      },
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:BatchGetImage",
        "ecr:DescribeImages",
        "ecr:DescribeRepositories",
        "ecr:GetAuthorizationToken",
        "ecr:GetDownloadUrlForLayer",
        "ecr:GetRepositoryPolicy",
        "ecr:ListImages"
      ]
    }
  ]
}
For security, the actual account ID is replaced with aws_account_id.
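As a quick sanity check on a repository policy like the one above, the following is a minimal Python sketch that reports which of the three actions ECR needs for a pull (ecr:BatchCheckLayerAvailability, ecr:GetDownloadUrlForLayer, ecr:BatchGetImage) are not granted to a given principal. The helper name missing_pull_actions is hypothetical, and the check is deliberately simplified: it ignores Deny statements, conditions, and partial wildcards such as ecr:Get*.

```python
# Actions ECR requires on the repository for an image pull.
PULL_ACTIONS = {
    "ecr:BatchCheckLayerAvailability",
    "ecr:GetDownloadUrlForLayer",
    "ecr:BatchGetImage",
}


def _as_list(value):
    """IAM policy fields may be a single string or a list of strings."""
    return value if isinstance(value, list) else [value]


def missing_pull_actions(policy, principal_arn):
    """Return the pull actions NOT granted to principal_arn (hypothetical helper).

    Simplified: only Allow statements are read; Deny statements,
    conditions, and partial wildcards are not evaluated.
    """
    granted = set()
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        arns = ["*"] if principal == "*" else _as_list(principal.get("AWS", []))
        if principal_arn in arns or "*" in arns:
            granted.update(_as_list(stmt.get("Action", [])))
    if "*" in granted or "ecr:*" in granted:
        return set()
    return PULL_ACTIONS - granted


ROLE_ARN = "arn:aws:iam::aws_account_id:role/ecsTaskExecutionRole"
policy = {
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "AllowPull",
            "Effect": "Allow",
            "Principal": {"AWS": ROLE_ARN},
            "Action": [
                "ecr:BatchCheckLayerAvailability",
                "ecr:BatchGetImage",
                "ecr:GetDownloadUrlForLayer",
            ],
        }
    ],
}
print(missing_pull_actions(policy, ROLE_ARN))  # prints set()
```

An empty result only means the repository policy is not the obvious blocker; the execution role's own identity policy and the network path still need to be checked separately.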
I have followed this guide on troubleshooting which states:
You can receive this error due to one of the following issues:
Your launch type doesn't have access to the Amazon ECR endpoint
I believe Fargate has access to ECR
Your Amazon ECR repository policy restricts access to repository images
I believe it permits pull access for the role used - see repo permissions above.
Your AWS Identity and Access Management (IAM) role doesn't have the right permissions to pull or push images
I believe it does have the necessary permissions - see task role context above.
The image can't be found
The image is in ECR and permissions are above
Amazon Simple Storage Service (Amazon S3) access is denied by your Amazon Virtual Private Cloud (Amazon VPC) gateway endpoint policy
I believe access is allowed. The IAM permissions above include S3 read access; furthermore, no explicit endpoint policy has been put in place, which according to the docs means full access by default.
To pull images, Amazon ECS must communicate with the Amazon ECR endpoint.
Routing table defined in the VPC:
All of the VPC's subnets are associated with it, so the VPC and anything running in it should be able to reach the internet. The security policy used for the task currently allows all ports (temporary while troubleshooting the ECR issue).
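To make the routing assumption checkable, here is a minimal Python sketch that inspects a route table in the shape returned by the EC2 DescribeRouteTables API and reports whether it has an active default route (0.0.0.0/0) through an internet gateway or a NAT gateway. The function name and the gateway ID in the example are hypothetical. Note that with awsvpc networking, a Fargate task in a subnet routed through an internet gateway also needs a public IP (assignPublicIp set to ENABLED) before that route helps; this sketch does not check that.

```python
def has_default_internet_route(route_table):
    """Return True if the route table has an active 0.0.0.0/0 route
    via an internet gateway (igw-...) or a NAT gateway.

    route_table is a dict shaped like one entry of the "RouteTables"
    list in a DescribeRouteTables response.
    """
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        if route.get("State", "active") != "active":
            continue
        via_igw = route.get("GatewayId", "").startswith("igw-")
        via_nat = bool(route.get("NatGatewayId"))
        if via_igw or via_nat:
            return True
    return False


# Hypothetical example data; the IDs are placeholders.
example_table = {
    "Routes": [
        {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local", "State": "active"},
        {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0123456789abcdef0", "State": "active"},
    ]
}
print(has_default_internet_route(example_table))  # prints True
```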
What am I missing that I am still getting this error?
This works using an EC2 instance. If I create a task that uses an EC2 instance with all other things being equal (where applicable) EXCEPT
EC2: Network Mode = Bridge
Fargate: Network Mode = awsvpc
the container provisions and runs, and the web app in the container runs normally. But in Fargate, the network mode MUST be awsvpc:
Fargate only supports network mode ‘awsvpc’.
I think this is where the problem resides, but I do not know how to remedy it.
The task definition is:
{
  "ipcMode": null,
  "executionRoleArn": "arn:aws:iam::aws_account_id:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "dnsSearchDomains": null,
      "logConfiguration": {
        "logDriver": "awslogs",
        "secretOptions": null,
        "options": {
          "awslogs-group": "/ecs/deploy-test-web",
          "awslogs-region": "us-west-2",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "entryPoint": [],
      "portMappings": [
        {
          "hostPort": 8080,
          "protocol": "tcp",
          "containerPort": 8080
        }
      ],
      "command": null,
      "linuxParameters": null,
      "cpu": 1,
      "environment": [],
      "resourceRequirements": null,
      "ulimits": null,
      "dnsServers": null,
      "mountPoints": [],
      "workingDirectory": null,
      "secrets": null,
      "dockerSecurityOptions": null,
      "memory": null,
      "memoryReservation": null,
      "volumesFrom": [],
      "stopTimeout": null,
      "image": "csrepo/test-web-v4.0.6",
      "startTimeout": null,
      "dependsOn": null,
      "disableNetworking": null,
      "interactive": null,
      "healthCheck": null,
      "essential": true,
      "links": null,
      "hostname": null,
      "extraHosts": null,
      "pseudoTerminal": null,
      "user": null,
      "readonlyRootFilesystem": null,
      "dockerLabels": null,
      "systemControls": null,
      "privileged": null,
      "name": "test-web-six"
    }
  ],
  "placementConstraints": [],
  "memory": "2048",
  "taskRoleArn": "arn:aws:iam::aws_account_id:role/ecsTaskExecutionRole",
  "compatibilities": [
    "EC2",
    "FARGATE"
  ],
  "taskDefinitionArn": "arn:aws:ecs:us-west-2:aws_account_id:task-definition/deploy-test-web3:4",
  "family": "deploy-test-web3",
  "requiresAttributes": [
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "ecs.capability.execution-role-awslogs"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "ecs.capability.private-registry-authentication.secretsmanager"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.task-iam-role"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "ecs.capability.task-eni"
    }
  ],
  "pidMode": null,
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "networkMode": "awsvpc",
  "cpu": "1024",
  "revision": 4,
  "status": "ACTIVE",
  "inferenceAccelerators": null,
  "proxyConfiguration": null,
  "volumes": []
}
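One thing that may be worth ruling out in a task definition like this is the image field itself. A private ECR image is referenced as account_id.dkr.ecr.region.amazonaws.com/repository[:tag], whereas a value with no registry hostname is resolved against Docker Hub. The following is a minimal Python sketch (the function name non_ecr_images is hypothetical, and the regular expression is an approximation of the private ECR URI format) that lists containers whose image is not an ECR URI:

```python
import re

# Approximate shape of a private ECR image URI:
#   <12-digit account id>.dkr.ecr.<region>.amazonaws.com/<repository>[:tag|@digest]
ECR_URI = re.compile(
    r"^\d{12}\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com(\.cn)?/.+$"
)


def non_ecr_images(task_definition):
    """Return (container name, image) pairs whose image is not an ECR URI."""
    return [
        (container.get("name"), container.get("image"))
        for container in task_definition.get("containerDefinitions", [])
        if not ECR_URI.match(container.get("image") or "")
    ]


# Only the fields relevant to the check, copied from the task definition above.
task_definition = {
    "containerDefinitions": [
        {"name": "test-web-six", "image": "csrepo/test-web-v4.0.6"}
    ]
}
print(non_ecr_images(task_definition))
# prints [('test-web-six', 'csrepo/test-web-v4.0.6')]
```

A container showing up in this list is not necessarily wrong (public images are legitimate), but if the image is meant to come from ECR, the ECR permissions discussed above never come into play for that pull.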
I solved this problem by deleting the ECR repository and creating it again.
Try adding this AWS managed policy: AmazonEC2ContainerServiceforEC2Role