Question

Alai

Asked: 2021-02-17 14:42:56 +0800 CST

Decentralized job management with redundancy/load balancing

I am looking for a job management solution for local processes, which usually run for several weeks. At the moment I use Jenkins, but there the server cannot be restarted (for security updates) and there is no redundancy. If one server goes offline, all of its jobs should be rebalanced onto the servers that are still online. It is okay to just start the script again with the same parameters, but it should be possible to disable this behavior. It should also be easy to add or remove servers.

I don't need a full solution for everything, but I have searched for software like this and did not really find what I was looking for. I appreciate any hints (including search keywords) pointing in the right direction. I basically only found CI software, but I want a solution that tolerates server failures.

1 Answer

KHobbits · Answer 1 · 2021-02-26T16:39:27+08:00

There are a number of ways to skin this cat.

One solution is to use a 'workflow' tool chain. Generally you start with a message queue, where you queue up jobs, using something like RabbitMQ, Redis, or AWS SQS. This is followed by some sort of task runner or executor, like Sidekiq or Celery.

The benefit of this sort of workflow is that you can scale it out, and it handles things like failed tasks, failed servers, retry logic, and reporting.

You can spin up clusters of the database component, and clusters of the worker component, which would let you build in the redundancy.
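
To make the queue-plus-worker idea concrete, here is a minimal single-process sketch. It uses Python's standard `queue` and `threading` modules as a stand-in for a real broker (RabbitMQ, Redis, SQS) and a real executor (Celery, Sidekiq); the function name `run_workers` and its parameters are made up for illustration, and a real setup would run the workers on separate servers.

```python
import queue
import threading


def run_workers(jobs, handler, num_workers=2, max_retries=3):
    """Run every job through handler; re-queue failures up to max_retries attempts.

    Hypothetical helper: an in-process stand-in for a broker plus worker pool.
    Returns (results, failed), both keyed by job.
    """
    q = queue.Queue()
    for job in jobs:
        q.put((job, 0))  # (payload, attempts made so far)
    results, failed = {}, {}
    lock = threading.Lock()

    def worker():
        while True:
            item = q.get()
            if item is None:  # sentinel: no more work
                q.task_done()
                return
            job, attempts = item
            try:
                value = handler(job)
                with lock:
                    results[job] = value
            except Exception as exc:
                if attempts + 1 < max_retries:
                    q.put((job, attempts + 1))  # retry later, possibly on another worker
                else:
                    with lock:
                        failed[job] = repr(exc)
            finally:
                q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    q.join()  # wait until every queued job has succeeded or been given up on
    for _ in threads:
        q.put(None)
    for t in threads:
        t.join()
    return results, failed
```

The point of the design is that a job lives in the queue rather than on a particular worker, so any worker can pick it up again after a failure. Setting `max_retries=1` disables the automatic restart, which matches the "should be possible to disable" requirement in the question.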

There are also compute schedulers, something like Kubernetes. Here you play Tetris with the available resources you have across servers, and jobs will be scheduled until you run low on resources.
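
As a rough illustration of the "Tetris" idea, the sketch below places jobs on servers with a simple first-fit rule. This is a toy model, not how Kubernetes actually schedules; the function name `schedule` and the single "CPU" resource are assumptions made for the example.

```python
def schedule(jobs, servers):
    """Toy first-fit placement (hypothetical helper, not the Kubernetes algorithm).

    jobs:    {job_name: cpu_needed}
    servers: {server_name: cpu_free}
    Returns (placements, pending): which server each job landed on,
    and the jobs that did not fit anywhere.
    """
    free = dict(servers)  # copy so the caller's dict is not modified
    placements, pending = {}, []
    # Place the largest jobs first so they are not starved by small ones.
    for job, need in sorted(jobs.items(), key=lambda kv: -kv[1]):
        target = next((s for s, cap in free.items() if cap >= need), None)
        if target is None:
            pending.append(job)
        else:
            free[target] -= need
            placements[job] = target
    return placements, pending
```

If a server goes offline, calling `schedule` again with the remaining servers gives a new placement, which is the rebalancing behavior asked for in the question.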

A third solution could be to use task monitoring tools, such as Monit or supervisord, which are designed to monitor processes and restart them when they go down. This approach requires you to handle most of the edge cases yourself, but would likely be easier to get going quickly.
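
The core of what such a monitoring tool does can be sketched in a few lines. This is only an illustration of the restart-on-failure loop, not a replacement for Monit or supervisord; the function name `supervise` and its parameters are hypothetical.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=3, delay=0.1):
    """Run cmd and restart it whenever it exits with a non-zero code.

    Hypothetical helper: gives up after max_restarts restarts.
    Returns the list of exit codes, one per run.
    """
    exit_codes = []
    for _ in range(max_restarts + 1):
        code = subprocess.run(cmd).returncode
        exit_codes.append(code)
        if code == 0:  # clean exit: the job finished, nothing to restart
            break
        time.sleep(delay)  # short pause so a crashing job does not spin
    return exit_codes
```

For example, `supervise([sys.executable, "job.py", "--param", "1"])` would rerun a (hypothetical) `job.py` with the same parameters after a crash. Note that this only covers a crashed process on one machine; it does nothing if the whole server goes down.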

A fourth, simpler still solution could be to use something like cron jobs or Windows scheduled tasks. Here your code runs on a schedule. You could scale this out by adding more servers, but you would run into the same issue as with the solution above: you'd have to handle things like race conditions in your own code.
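
One common race with scheduled jobs is a new run starting while the previous one is still going. Below is a minimal sketch of guarding against that with an exclusive file lock. It assumes a POSIX system (`fcntl` is not available on Windows), and the name `single_instance` and the lock path are made up for the example.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path):
    """Yield True if we got the exclusive lock, False if someone else holds it.

    Hypothetical helper; POSIX only. The lock is released when the
    descriptor is closed, including when the process dies.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR)
    try:
        try:
            # Non-blocking: fail immediately instead of waiting for the holder.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            acquired = True
        except BlockingIOError:
            acquired = False
        yield acquired
    finally:
        os.close(fd)  # closing the descriptor also releases the lock
```

A scheduled script would wrap its work in `with single_instance("/tmp/myjob.lock") as ok:` and exit early when `ok` is false. This only protects against overlapping runs on the same machine; coordinating across several servers needs a shared lock somewhere all of them can reach.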

All of the above solutions can be managed by infrastructure and config management tools such as Terraform and Ansible, which allow you to keep things uniform and make updates and redeployments easier.
