We have a Docker container that can serve only one user request at a time, so we want to scale up and down as user requests arrive.
But in Docker Swarm I can only set the number of containers for a service statically, by providing a number. What we need is this: whenever a new user request arrives while the existing container is busy with another user/session, a new container should be started, and it should be destroyed (scaled down) once the session is complete.
Can anyone please suggest how to do that?
AFAIK, Docker Swarm does not offer automatic, dynamic scaling based on resource utilization. The common approach today is to use Kubernetes (K8s) or another container orchestration platform to do it for you. That being said, you can still write a custom script or application that actively monitors the utilization of your containers and issues scaling commands to your swarm, but that's neither a stable nor an effective solution.
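For illustration only, here is a minimal sketch of what such a custom script might look like for the one-session-per-container case. It assumes you already have some way to count busy sessions (that part is not shown and depends on your application); the function names, the `spare` headroom, and the `max_replicas` cap are hypothetical choices, not anything Swarm provides. The only real Docker command used is `docker service scale SERVICE=REPLICAS`.

```python
import subprocess
from typing import List


def desired_replicas(busy_sessions: int, spare: int = 1, max_replicas: int = 20) -> int:
    """One container per active session plus `spare` idle ones, capped at `max_replicas`.

    `spare` and `max_replicas` are illustrative defaults, not Swarm settings.
    """
    if busy_sessions < 0:
        raise ValueError("busy_sessions must be non-negative")
    return min(busy_sessions + spare, max_replicas)


def scale_command(service: str, replicas: int) -> List[str]:
    """Build the `docker service scale` invocation as an argument list."""
    return ["docker", "service", "scale", f"{service}={replicas}"]


def apply_scale(service: str, busy_sessions: int) -> None:
    """Run the scale command on a swarm manager node; raises if docker fails."""
    replicas = desired_replicas(busy_sessions)
    subprocess.run(scale_command(service, replicas), check=True)
```

You would call something like `apply_scale("my_service", busy_sessions)` from a polling loop on a manager node. Note that scaling down this way lets Swarm pick which task to stop, so a container with a live session could be removed; that is part of why this approach is fragile.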
It also depends on where you're running your containers. If it's in the cloud, then depending on the provider you get a managed service for container orchestration or, even better, a managed K8s solution. K8s has somewhat of a learning curve, but that's how containers are commonly handled now. If you're running them on a single computer or bare metal, there are tools that run K8s locally, such as minikube.
You can use Docker Flow Monitor. The stack to achieve auto scaling with Docker Swarm includes:
The linked example is based on response time but it should be possible to base the scaling events on a different metric.
However, in this case it would be better to fix the microservice so that it can handle multiple concurrent sessions.
I was thinking about this as well. One thing to remember is that the containers on a node share that node's resources. If the node becomes overloaded, it doesn't really matter how many containers you have; you have to scale out to another node, where, given the same number of containers, each container gets more RAM. If you simply create more container replicas than you expect to have nodes, or deploy the service in global mode (one task per node), you shouldn't even have to write a script to scale up containers. If you're on AWS, you could create CloudWatch alarms that monitor each node's CPU usage, scale up automatically when an alarm crosses some threshold, and have each new node automatically join the swarm so that containers are placed across it. You could also write a script that periodically checks how many nodes your swarm actually has and then redeploys the service with a few more replicas than you have nodes.
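A rough sketch of that last idea, assuming the script runs on a manager node: it counts nodes from the output of `docker node ls --format "{{.ID}}"` and sets the replica count with `docker service update --replicas`. The function names and the `extra` margin of 2 are hypothetical, chosen only for illustration.

```python
import subprocess
from typing import List


def count_node_ids(node_ls_output: str) -> int:
    """Count non-empty lines, one per node ID printed by `docker node ls`."""
    return sum(1 for line in node_ls_output.splitlines() if line.strip())


def replicas_for_nodes(node_count: int, extra: int = 2) -> int:
    """A few more replicas than nodes; `extra` is an illustrative margin."""
    return node_count + extra


def update_command(service: str, replicas: int) -> List[str]:
    """Build the `docker service update --replicas N SERVICE` argument list."""
    return ["docker", "service", "update", "--replicas", str(replicas), service]


def rescale_to_nodes(service: str) -> None:
    """Query the swarm for its nodes and update the service's replica count."""
    result = subprocess.run(
        ["docker", "node", "ls", "--format", "{{.ID}}"],
        check=True, capture_output=True, text=True,
    )
    nodes = count_node_ids(result.stdout)
    subprocess.run(update_command(service, replicas_for_nodes(nodes)), check=True)
```

Running `rescale_to_nodes("my_service")` from cron or a timer would approximate the periodic check. This counts every node listed, including ones that are down or drained, so a real script would probably want to filter on node status first.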