Our Nomad agents sometimes fail jobs because they cannot pull images from ECR. /var/log/docker contains messages like:
Not continuing with pull after error: error pulling image configuration: Get https://prod-eu-west-1-starport-layer-bucket.s3.eu-west-1.amazonaws.com/uuid/uuid?...: dial tcp IP:443: i/o timeout
The jobs are not really idempotent, so I would prefer not to have Nomad retry them.
Is there any way I can instruct Docker to retry these operations?