I'm trying to set up a connection to a MongoDB Atlas database from an AWS Fargate container. VPC peering is set up and works: I can successfully connect to the MongoDB Atlas cluster from a bastion in the private subnets of the AWS VPC. However, when I try the same connection from a Fargate task, it fails to connect.
For instance, if I attempt to connect with the following mongo CLI command:
mongo "mongodb+srv://user:[email protected]/database"
Then I get the following error:
MongoDB shell version v4.0.20
connecting to: mongodb://cluster0-shard-00-01.foo0.mongodb.net.:27017,cluster0-shard-00-02.tzhow.mongodb.net.:27017,cluster0-shard-00-00.foo0.mongodb.net.:27017/cxchat?authSource=admin&gssapiServiceName=mongodb&replicaSet=atlas-mdt101-shard-0&ssl=true
2020-09-09T13:16:46.295+0000 I NETWORK [js] Starting new replica set monitor for atlas-mdt101-shard-0/cluster0-shard-00-01.foo0.mongodb.net.:27017,cluster0-shard-00-02.foo0.mongodb.net.:27017,cluster0-shard-00-00.foo0.mongodb.net.:27017
2020-09-09T13:16:56.351+0000 W NETWORK [ReplicaSetMonitor-TaskExecutor] Unable to reach primary for set atlas-mdt101-shard-0
2020-09-09T13:16:56.351+0000 I NETWORK [ReplicaSetMonitor-TaskExecutor] Cannot reach any nodes for set atlas-mdt101-shard-0. Please check network connectivity and the status of the set. This has happened for 1 checks in a row.
2020-09-09T13:17:11.867+0000 W NETWORK [js] Unable to reach primary for set atlas-mdt101-shard-0
2020-09-09T13:17:11.867+0000 I NETWORK [js] Cannot reach any nodes for set atlas-mdt101-shard-0. Please check network connectivity and the status of the set. This has happened for 2 checks in a row.
*** It looks like this is a MongoDB Atlas cluster. Please ensure that your IP whitelist allows connections from your network.
2020-09-09T13:17:11.868+0000 E QUERY [js] Error: connect failed to replica set atlas-mdt101-shard-0/cluster0-shard-00-01.foo0.mongodb.net.:27017,cluster0-shard-00-02.foo0.mongodb.net.:27017,cluster0-shard-00-00.foo0.mongodb.net.:27017 :
The same command works fine from an EC2 instance in a private subnet of the VPC (the same subnets as assigned to the ECS container).
I understand that Fargate networking is a bit different. The task is set up with awsvpc as the NetworkMode. The error suggests that a whitelist entry might be needed on the MongoDB Atlas side, but I've checked this: the task IP is 10.2.0.129, which is comfortably within the 10.2.0.0/16 range whitelisted on Atlas.
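For what it's worth, the whitelist claim can be double-checked mechanically. This is only a small sketch using Python's standard ipaddress module; the helper name ip_in_cidr is made up for illustration, and the values are the task IP and CIDR quoted above.

```python
import ipaddress


def ip_in_cidr(ip: str, cidr: str) -> bool:
    """Return True if the address falls inside the given CIDR block."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


# Task IP and Atlas whitelist range as quoted in the question.
print(ip_in_cidr("10.2.0.129", "10.2.0.0/16"))
```

This only confirms that the address is in range; it says nothing about whether traffic from the task actually reaches Atlas.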
Has anybody tried this with Fargate or anything similar? I would have thought that the VPC peering connection would also be active for the Fargate task, given that it is set up in the same VPC, subnets, etc.
I suspect it's something to do with the security group, perhaps the outbound rules are different / missing? Or perhaps with routing? Maybe some subnets don't have the right route table with the VPC peering entry attached to them?
Here's what I would do:
Spin up an EC2 instance in the same subnet where your Fargate container runs and assign it the same security group and the same IAM role. With that, they should behave the same; however, EC2 is easier to debug.
Now test mongo access. If it doesn't work, figure out why: use tcpdump to find out where the mongo command tries to connect to, and what happens next. Does it connect? Does it get a reply?
You can also try running the container on an EC2-based ECS cluster; switching between Fargate and EC2 is simple.
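Alongside tcpdump, a plain TCP connection test run from both the EC2 instance and the Fargate task can show whether port 27017 is reachable at all, independent of the mongo shell. The following is a minimal sketch using only Python's standard library; the function names can_connect and check_hosts are hypothetical, and it assumes Python is available in the container image.

```python
import socket


def can_connect(host, port=27017, timeout=5.0):
    """Try to open a TCP connection; return (True, None) or (False, error text)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # Covers DNS failures, refused connections and timeouts alike.
        return False, f"{type(exc).__name__}: {exc}"


def check_hosts(hosts, port=27017, timeout=5.0):
    """Run can_connect for each host and collect the results by host name."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

You would call check_hosts with the shard host names from the "connecting to:" line in the log (for example cluster0-shard-00-00.foo0.mongodb.net). A timeout would point towards routing or security group rules, whereas a name resolution error would point towards DNS.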
Hope some of it helps :)