We're currently setting up a server to do some heavy lifting (ETL) after another process has finished within the business. At the moment we're firing off jobs either via scheduled cron jobs or remote execution (via ssh). Earlier this week we hit an issue with too many jobs running side by side on the system, which brought all the jobs to a snail's pace as they fought for CPU time.
I've been looking for a batch scheduler: a system where we can insert jobs into a run queue and the system will process them one by one. Can anyone advise a program/system to do this? Low cost / FOSS would be appreciated due to the shoestring nature of this project.
I'd set up some kind of queueing service. A quick Google on "ready to use" stuff shows this:
Depending on your needs you could simply
Actually there's more to it: you could have requirements that call for a priority queue, which brings up problems like starving jobs and the like, but it's not that hard to get something up and running quite fast.
If lpd, as suggested by womble, works for you, I'd take that. Having such a system maintained by a larger community is of course better than creating your own bugs for problems others have already solved :)
Also, a queuing service has the advantage of decoupling the resources from the actual number crunching. By making the jobs available over a network connection you can simply throw hardware at a (possible) scaling problem and get nearly endless scalability.
Two solutions spring to mind:

- `xargs -P` to control the maximum number of parallel processes at one time
- `make -j`

They are both summarised in this SO thread in more detail.
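As a minimal sketch of the `xargs -P` route (the `echo` commands stand in for real ETL jobs; in practice you'd feed it a file with one command per line):

```shell
# Feed one shell command per line to xargs.
# -d '\n' : one argument per line (GNU xargs)
# -n 1    : hand one command to each sh invocation
# -P 2    : run at most two commands in parallel
printf '%s\n' 'echo job1' 'echo job2' 'echo job3' \
  | xargs -P 2 -n 1 -d '\n' sh -c
```

Setting `-P 1` gives you a strictly sequential queue, which matches your "one by one" requirement.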
There is a possibility that these may not be applicable to the structure of your scripting.
A heavyweight solution to your problem is to use something like Sun Grid Engine (SGE). SGE is distributed resource management software that allows the resources within a cluster/machine (CPU time, software, licenses, etc.) to be utilized effectively.
Here is a small tutorial on how to use SGE.
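As a rough sketch (the script name `run_etl.sh` and job name are made up for illustration; this needs a working SGE install, so it's not runnable standalone), submitting work to SGE looks like:

```shell
# Submit a job script; the scheduler dispatches it when a slot is free.
# -cwd : run in the current directory
# -j y : merge stderr into stdout
# -o   : write the job's output here
qsub -cwd -j y -o etl.log run_etl.sh

# Serialise jobs by making one wait on another:
qsub -hold_jid previous_job run_etl.sh
```

The `-hold_jid` option is what gives you one-at-a-time behaviour without hand-rolling a queue.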
You could check out some of the batch systems used for scheduling jobs on clusters, which have the option to monitor resource usage and declare a system too loaded to dispatch more workload to it. You could also easily configure them to run only one job at a time, but for that you may be better off with something less complex than a full-fledged batch scheduler (in the spirit of keeping things simple).
As for freely available batch/scheduling systems, the two that spring to mind are OpenPBS/Torque and SGE.
Edited to add: If you're ever going to add more processing capacity in the future in the form of more boxes, a batch/scheduling system like Torque/OpenPBS/SGE may be good choices as they're basically built to manage compute resources and distribute workloads to them.
You can always use lpd -- yeah, old school, but it's really a generalised batch processing control system masquerading as a print server.
From `man batch`: I think this might be what you're looking for. It's part of Debian's `at` package.

wava: a memory-aware scheduler that lets you enqueue batch jobs (submitted with a maximum physical memory usage promise) to be executed when enough physical memory (RSS) is available on the system.

We used Control-M for this exact reason with ETLs and such (but a few years back now). Sure, it's not free or open source, but it had very good flexibility in terms of batch processing (a la if-this-then-that execution flows).
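As a sketch of the `batch` route mentioned above (the script path is made up; this needs the `at` package and a running `atd` daemon, so it's not runnable standalone):

```shell
# batch reads commands from stdin and runs them when the system
# load average drops below a threshold (1.5 by default), so queued
# jobs don't pile onto an already-busy machine.
echo '/opt/etl/run_transform.sh' | batch

# List what's still queued:
atq
```

Because `batch` defers on load rather than strictly serialising, it directly addresses the "jobs fighting for CPU time" problem in the question.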
A shell script called by cron could easily do this; it would process the queue line by line.
I would use Torque, which is an updated version of the FOSS OpenPBS.