So I am getting to grips with (and loving) Zabbix, and have started the process of fine-tuning the alerts.
I have an alert that is triggered on a Linux server when it has over 300 processes.
Now, this is sort of a central server that acts as a firewall and runs a bunch of stuff: a proxy, an httpd server, MySQL, OpenVPN and Zabbix itself.
Is there anything to look out for before I bump the alert trigger up to 350 processes?
The CPU load is still relatively low; I was thinking there might be other things to check before upping the alert threshold.
Would I need to check whether the machine is bottlenecked elsewhere, i.e. I/O bound?
Any good advice on this, or pointers to good documentation (hopefully well-written and easy to understand), would as always be greatly appreciated.
Like @sam said, it all depends on what the server is doing and how beefy the server hardware is. Running only a handful of extremely CPU-, memory- and/or I/O-intensive processes can easily overload even a powerful server. In particular, if something makes your server swap, everything will start moving slower than a snail or a turtle.
On the other hand, something like a Postfix server can easily have a process count in the hundreds or thousands, as everything Postfix does is very lightweight.
In my opinion, monitoring (or at least alerting on) the global process count is not useful. However, if you know for sure that there should not be more than X instances of some process around, then monitor that process and raise an alert in the event there are suddenly more than X of them.
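If you go that route, the per-process item and trigger are only a couple of lines. A minimal sketch, assuming the stock Zabbix Linux agent and the older {host:key.function()} trigger syntax; the host name, process name and threshold are just placeholders:

```
# Agent item: count the running mysqld processes
proc.num[mysqld]

# Trigger: fire if there are suddenly more than 50 of them
{myserver:proc.num[mysqld].last(0)}>50
```

On newer Zabbix versions the trigger expression syntax differs, but the proc.num[] item key stays the same.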
You can also graph the count of particular processes for trends: for example, I tend to graph the Cyrus IMAP/POP process count so I can see whether it hovers anywhere near the current hard limits.
If you have reasonably predictable process behaviour, you can use something like psmon to automatically restart or kill misbehaving processes (with optional logging/e-mailing of information about the events psmon handled). Sure, Zabbix can be used for this too, but psmon is very easy to configure for this kind of task.
What I would graph and monitor
In general, graph (and monitor) at least the following:
Then monitor at least the following:
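As a rough, non-exhaustive illustration, the stock Zabbix Linux agent already ships item keys covering the basics (the device name here is a placeholder for whatever your data disks are called):

```
system.cpu.load[all,avg1]       # 1-minute load average
system.cpu.util[,iowait]        # share of CPU time spent waiting on I/O
vm.memory.size[available]       # memory still available to applications
system.swap.size[,pfree]        # free swap, in percent
vfs.dev.read[sda,operations]    # read operations on disk sda
vfs.dev.write[sda,operations]   # write operations on disk sda
proc.num[]                      # total process count, mostly for trending
```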
I hope this helps you. :)
I think it's very hard to answer this without more information, but I'll give it a go.
It depends:
Having five FFmpeg threads rendering HD video on a single-core server would be too many, but the same server could probably quite happily run hundreds, even thousands, of five-line Python scripts with no problem. In general, monitor everything you can think of! If it outputs a number, monitor it and log it; you never know what stats you might need down the line. The number of processes is, on its own, probably a poor measure of performance. It's useful in conjunction with other information: say the server has just gone down, then it's useful to look at running procs, CPU/load, memory, disk I/O etc. But unless you can determine exactly how much CPU/memory/etc. each process uses, the raw count is not that useful.
Say you have a very predictable application: each user starts one proc on the server, and each proc uses 10 MB of memory, 1% of the available CPU and 1% of the available disk I/O continuously for as long as it runs. Assume the base usage of the system is a constant 3% CPU and 500 MB of memory, and that no processes other than your application will be started on the box. From that it's very easy to predict how many procs you can run before hitting issues: CPU caps you at (100 − 3)/1 = 97, disk I/O at 100, and memory at (total RAM − 500 MB)/10 MB, so the lowest of those is your ceiling. But I don't think I've ever seen an application with such precise usage.
A much better strategy would be to monitor the resources used by a particular process or group of processes. Say you're running an Apache server with mod_php: monitor the average memory, CPU and disk I/O of the httpd processes (sketched below), and that will give you a much better insight into what your server is actually doing.
TL;DR
Alerting on process count is not that useful; monitoring it is. There are many things which can push the process count up without having any effect on the system, but a single process can take a server down.
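For the httpd example, a hedged sketch of the built-in agent items that give per-process-group numbers (proc.cpu.util needs a reasonably recent agent, and there is no stock item for per-process disk I/O, so that part would need a custom check):

```
proc.num[httpd]          # number of httpd workers currently running
proc.mem[httpd,,avg]     # average memory used per httpd process
proc.cpu.util[httpd]     # CPU utilisation of the httpd processes
```

Graph those alongside the overall CPU, memory and disk items and the raw process count becomes much easier to interpret.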