Often when I find myself in front of a unix/linux (or any other *nix variant) console and have to quickly diagnose the server's condition, I just can't remember everything that should be checked.
I'll try vmstat, some ps/top manoeuvres, read procinfo and some log files (boot & sys), but what I'd really like is a quick way to view the condition of the CPU, hard disks and physical memory.
I know a lot of it is already present in vmstat, but somehow I miss the ease of Server 2008, where you get a nice resource monitor and even the task manager itself gives a quick peek at the system's condition (not to mention Server 2008's monitoring graph tools).
Any suggestions, or am I just being lame because vmstat really is the grail?
Edit: Well, thanks for the feedback, everyone. I should add that I'm not really talking about constant monitoring (where nagios is a very good proposition), but about an occasional walk to a server - not necessarily mine - to do a quick check of the system's condition (sometimes I just happen to be somewhere and bang: "Hey, can you come over to check this one?").
The stick with some utility scripts is indeed a nice idea; I already have one with the sysinternals apps for Windows machines. Htop is cool too, although I doubt I'll be able to install it wherever I happen to be.
If you want a bit of bells and whistles under Linux, try htop.
It's top on steroids; you can configure it to display, as bars, the CPU time spent in userland/system/iowait/IRQs. This might give you a good view of what is causing the load.
Still, some of the info you get from vmstat will not be displayed in htop.
You can also take a look at sar from sysstat (iostat, mentioned by Kyle Brandt, is part of the same package).
Depending on how many servers you have, you may want to set up nagios or a similar monitoring system for this. Basically you set limits on metrics (CPU usage, memory usage, etc.), and if a limit is exceeded you receive an alert, which could be a page, an email or whatever. However, if this is your home PC, I find myself using nmon. It's great for getting an overall picture of your system. It will display info about memory, disk, CPU and network usage, as well as kernel information.
top is a good tool (if it's installed), but the other one I like for a real quick look to see if anything is wrong is dmesg. That should let you know if the server is experiencing something really major (disconnected NICs, disk faults, memory faults, etc.).
Don't forget iostat, part of the sysstat package. If you want something that's easily portable, why not write a shell or Perl script that you can develop over time? This would be a good way to learn the differences between systems and get better at scripting. You can generally parse most of the information out of /proc or just wrap up all those tools.
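To give an idea of what such a script could look like, here is a minimal sketch in Python rather than shell or Perl (my choice, not the answerer's). It assumes a Linux box with /proc/meminfo and /proc/loadavg; the function names are made up for illustration.

```python
#!/usr/bin/env python3
"""Quick health snapshot built from /proc (Linux only, illustrative sketch)."""


def parse_meminfo(text):
    """Return the numeric fields of /proc/meminfo as a dict (most are in kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages as floats."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def read(path):
    with open(path) as handle:
        return handle.read()


def snapshot():
    """Build a two-line summary of load and memory."""
    mem = parse_meminfo(read("/proc/meminfo"))
    load = parse_loadavg(read("/proc/loadavg"))
    # MemAvailable is missing on older kernels, so fall back to MemFree.
    available = mem.get("MemAvailable", mem.get("MemFree", 0))
    total = mem.get("MemTotal", 0)
    return "load average: %.2f %.2f %.2f\nmemory: %d of %d MiB available" % (
        load + (available // 1024, total // 1024)
    )


if __name__ == "__main__":
    try:
        print(snapshot())
    except OSError as exc:
        print("cannot read /proc:", exc)
```

Keeping the parsing separate from the file reading makes it easy to extend over time, and to adapt on systems where the information comes from somewhere other than /proc.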
I generally run top as a first port of call when I log into a host that has reported problems. It gives you a good overview of CPU, memory and the run queue length, and from that I can get an idea of what to investigate next. If iowait is up I look at iostat; if memory is low I run ps to see which processes are using memory (or just sort top by memory), etc.
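If top isn't available, the iowait figure it shows can be approximated by hand. The following is a rough sketch, assuming Linux and the usual /proc/stat field order (user, nice, system, idle, iowait, ...); the function names are hypothetical, and this is not how top itself is implemented.

```python
#!/usr/bin/env python3
"""Estimate the iowait share from two samples of /proc/stat (sketch)."""
import time


def cpu_times(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before, after):
    """Share of elapsed CPU time spent in iowait between two samples."""
    # Only the first eight counters are summed: guest and guest_nice
    # are already included in user and nice.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if len(deltas) < 5 or total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def sample():
    with open("/proc/stat") as handle:
        return cpu_times(handle.read())


if __name__ == "__main__":
    try:
        first = sample()
        time.sleep(1)  # sampling interval, chosen arbitrarily
        print("iowait: %.1f%%" % iowait_percent(first, sample()))
    except OSError as exc:
        print("cannot read /proc/stat:", exc)
```

A persistently high value here is the cue to go and look at iostat for the per-device picture.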
The beautiful thing about unix is you don't have to accept the tools that are offered. Write a script that shows you the info you want.
iostat, vmstat, top, ps (remember you can customise the output fields of ps - quite a lot with GNU ps), df, dmesg, /var/log/messages, sar. You could use syslog-ng to filter all of the critical log messages into a separate log file.
One unusual(ish) thing that is worth checking on Linux is /proc/mounts. Sometimes a filesystem goes read-only, and that is not shown by mount but is shown in /proc/mounts. I've seen this both on VMs and with FC storage (e.g. where a path has disappeared in a weird way).
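The /proc/mounts check is easy to script. Below is a small sketch in Python, assuming the standard /proc/mounts layout (device, mount point, filesystem type, comma-separated options); the function name is invented for the example.

```python
#!/usr/bin/env python3
"""List filesystems that /proc/mounts reports as read-only (sketch)."""


def read_only_mounts(mounts_text):
    """Return (mount_point, fs_type) for every entry mounted read-only."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        mount_point, fs_type, options = fields[1:4]
        # Compare whole options so that 'errors=remount-ro' does not match.
        if "ro" in options.split(","):
            result.append((mount_point, fs_type))
    return result


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as handle:
            for mount_point, fs_type in read_only_mounts(handle.read()):
                print("read-only: %s (%s)" % (mount_point, fs_type))
    except OSError as exc:
        print("cannot read /proc/mounts:", exc)
```

Some entries are read-only on purpose (squashfs or iso9660 images, for instance), so the output is a list of things to look at rather than a list of faults.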
Another 'top on steroids' program: atop. So verbose, it's scary.