I'm interested in learning about tools and techniques used to manage many Linux machines. (That is, deploying and maintaining updates.)
One way I thought of to do this is to write a Bash script that uploads another script to each server and executes it, one server at a time. For example:
for server in "${servers[@]}"; do
    # copy the update script to the server, then run it there
    scp update_script.sh "user@${server}:scripts/"
    ssh "user@${server}" "sh ~/scripts/update_script.sh"
done
And update_script.sh would use apt-get/aptitude, yum, or whatever to update packages on the server.
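For example, update_script.sh might be as simple as this sketch for a Debian/Ubuntu host (a yum-based host would run "yum -y update" instead):
#!/bin/sh
# stop on the first error so a failed update doesn't go unnoticed
set -e
apt-get update
apt-get -y upgrade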
Are there better ways to do things like this?
Try Puppet.
Another excellent (truly excellent) tool is Webmin. If you add several servers running Webmin together (in the Webmin interface), you can push updates and view package configurations in its cluster pages.
An alternative, which is more geared to rolling out images, is SystemImager.
ClusterSSH is what you're looking for. It provides a way to broadcast commands to all nodes in a cluster. Think of it like BashReduce sans Reduce.
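For example, a minimal invocation (hostnames are illustrative) opens one terminal per host plus a console window that broadcasts your keystrokes to all of them:
# type once in the console window, the command runs everywhere
cssh user@web1 user@web2 user@web3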
Someone else already mentioned Puppet.
In the same vein, I can recommend Cfengine. The learning curve can be a little steep, but once you get the hang of it, it's great. I use it to manage about 50 servers and can't believe I ever got along without it.
Try Capistrano. It works much like your Bash foreach loop above, but it's based on Ruby instead of Bash. Capistrano is geared toward operational tasks (e.g., put a server into maintenance mode, take it out of maintenance mode).
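As a rough sketch (untested; the hostnames are illustrative, and this assumes Capistrano 2's built-in invoke task), a one-off command across several hosts looks like:
# run the command on each listed host, via sudo
cap invoke COMMAND="apt-get -y upgrade" HOSTS=web1,web2,web3 SUDO=1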
+1 for Puppet. It's a good fit for idempotent operations that leave a system in a known state.
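To make that concrete, here's a minimal sketch (the manifest and package name are illustrative); applying it once or ten times converges to the same state:
cat > update.pp <<'EOF'
# keep the package at the newest available version; re-applying is a
# no-op once that state is reached
package { 'openssl':
  ensure => latest,
}
EOF
puppet apply update.pp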
If you want to be able to run commands in parallel on a cluster of Linux systems, one of the parallel SSH tools may be of interest.
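For instance, a quick pssh sketch (hosts.txt is a hypothetical file with one server per line; -p caps the number of parallel connections):
# -i prints each host's output inline; assumes root ssh access
pssh -h hosts.txt -l root -p 20 -i "apt-get update && apt-get -y upgrade"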
As a general way of configuring a large network of systems, you probably want to be using tools already mentioned, such as cfengine and Puppet.
Check out Func for running "things" on several servers at once. Google for "func redhat".
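A hedged sketch of what that looks like (assuming an overlord/minion setup is already in place; the host glob is illustrative):
# run a shell command on every certified minion matching the glob
func "*.example.com" call command run "yum -y update"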
Also, if you like Puppet you should check out Chef (opscode.com). It solves a few of Puppet's problems, and the config files are written in Ruby instead of a DSL. Ohai is likewise a more informative version of Facter. Bcfg2 is another config management tool, written in Python. It's nice for high-security or auditing environments, because every single package can be accounted for; if something gets added outside of Bcfg2, it flags an alert. Its downside is that the config files themselves are written in XML.
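To illustrate the Ruby point, here's a minimal Chef recipe sketch (the cookbook layout and package name are illustrative):
mkdir -p cookbooks/updates/recipes
cat > cookbooks/updates/recipes/default.rb <<'EOF'
# plain Ruby, not a separate DSL: upgrade this package on any node
# that runs the recipe
package 'openssh-server' do
  action :upgrade
end
EOF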
I had a bunch of links, but I guess new users can't post hyperlinks.
I don't know much about script deployment, even if your idea doesn't seem too bad to me. However, if you want to monitor your Linux machines as part of managing them, I strongly recommend Nagios for this task.
Nagios is a bit of a pain to configure (like any Linux software, mind you) if you want it to handle more than the usual default tasks, but there is extensive documentation on their website: nagios.sourceforge.net/docs/3_0/toc.html
Of course, it is free ;)
How big of a group is "many"? As mentioned already, Webmin is good if it does what you want - you can run arbitrary commands and do some common admin tasks (like synchronizing local users) over HTTPS, so you have less overhead than with the ssh method. Webmin's cluster tools are good for "tens" of machines; I've never tried it on larger groups.
There's also cfengine, which was mentioned - it can be used for small to extremely large groups of machines (as in, thousands), because it can do tiered management as needed (one master, then some sub-masters, etc.). I'm currently using it to manage a network of about 3,500 Unix machines of different flavors. As the other poster said, it's a pain to learn initially, but it works very well.
If your systems are homogeneous, or at least form fairly homogeneous groups, and you don't have a whole lot (like, under a few hundred), there are several good cluster management toolsets out there. The Oscar project has some admin tools already assembled for use in managing a cluster: http://svn.oscar.openclustergroup.org/trac/oscar, and there are other similar projects whose names escape me (I'll probably remember as soon as I post).
In simpler terms, there are a few parallel ssh tools discussed at the Linux Journal: http://www.linux.com/archive/feature/151340. Like the other ssh-based cluster tools though, you'll still start running into problems when you try to open too many concurrent ssh connections. Depending on your hardware, my experience is that you'll probably want to keep parallelism below 20-30 simultaneous ssh links.
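If you want to enforce that cap from plain bash, something like this works (hosts.txt is a hypothetical one-host-per-line file; ssh -n keeps ssh from swallowing the host list on stdin):
# at most 20 concurrent ssh sessions at any one time
xargs -P 20 -I {} ssh -n "root@{}" "apt-get -y upgrade" < hosts.txt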
Basically, there are a whole lot of pre-made solutions for this, and there are just as many home-grown solutions. Look around freshmeat.net and google and you'll find several. Or roll your own; it's not a particularly difficult challenge to solve acceptably if you're just doing 10-20 machines... :)
I use cfengine to administer ~150 Linux machines. I have to log in to any given machine less than once a week; cfengine does the rest: adding users, removing them, installing packages, and so on.