I run the perl script in screen (I can log in and check debug output). Nothing in the logic of the script should be capable of killing it quite this dead.
I'm one of only two people with access to the server, and the other guy swears that it isn't him (and we both have quite a bit of money riding on it continuing to run without a hitch). I have no reason to believe that some hacker has managed to get a shell or anything like that. I have very little reason to suspect the admins of the host operation (bandwidth/cpu-wise, this script is pretty lightweight).
Screen continues to run, but at the end of the output of the perl script I see "Killed" and it has dropped back to a prompt. How do I go about testing what is whacking the damn thing?
I've checked crontab; nothing in there that would kill random or non-random processes. Nothing in any of the log files gives any hint. It will run from 2 to 8 hours, it would seem (and on my Mac at home, it will run well over 24 hours without a problem). The server is running Ubuntu, version something or other; I can look that up if it matters.
Put in signal handlers for all the catchable signals (TERM, SEGV, INT, HUP, etc.) and have them log whenever they are hit. It won't tell you what is sending the signal, but it will let you see which signal it is, and perhaps ignore it.
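A minimal sketch of such handlers (the log path and the exact signal list are just illustrative; adjust to taste):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Install a logging handler for each catchable signal of interest.
# SIGKILL and SIGSTOP cannot be caught, so they are not listed here.
foreach my $sig (qw(TERM INT HUP QUIT SEGV)) {
    $SIG{$sig} = sub {
        my ($name) = @_;    # Perl passes the signal name to the handler
        open my $log, '>>', '/tmp/signal.log' or return;
        print $log scalar(localtime) . ": caught SIG$name\n";
        close $log;
        # Returning from the handler resumes the program where it left off.
    };
}
```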
That would print out when it caught a SIGTERM or SIGINT and then return control to the program. Of course, with all those signals being caught, the only way to kill it would be for the program itself to exit, or to send it a SIGKILL, which can't be caught.
I realize this isn't exactly an answer to the question you asked, so I apologize if it's somewhat off-topic, but: does your app really need to run continuously, forever? Perl is not the most resource-thrifty environment in the world, and while interpreter start-up overhead is a real cost, extremely long-running scripts have troubles of their own. Memory leaks, often at a level below your control, are the bane of the vanilla-perl developer's existence. Folks often mitigate those issues either by running in a more formally resource-conservationist sub-environment like Perl::POE, or by handing the long-running listener part of the job over to a front-end service like xinetd and only executing the perl component when work needs to be done.
I run several perl scripts which run continuously reading and processing the output of our (considerably large) central syslog stream; they suffer from terrible, inexplicable "didn't free up memory despite pruning hash keys" problems at all times, and are on the block to be front-ended by something better suited to continuous high-volume input (an event queue like Gearman, for example), so we can leave perl to the data-munging tasks it does best.
That went on a bit; I do apologize. I hope it's at least somewhat helpful!
Without much in the way of actual knowledge, I'd start by looking in dmesg output and the assorted syslogs to see if the OOM killer is running. If so, that's probably it.
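For instance (the log paths are Ubuntu's defaults; other distros vary):

```shell
# The OOM killer logs its victims to the kernel ring buffer.
# grep exits non-zero when nothing matches, so don't treat that as an error.
dmesg | grep -i 'killed process' || true

# Check the persistent logs too, in case the ring buffer has already wrapped.
grep -i 'out of memory' /var/log/syslog /var/log/kern.log 2>/dev/null || true
```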
Syslog is the first thing to consult. If it isn't sufficient…

You can't determine who sent a signal to a process. It could be another process, it could be the kernel, etc. Short of involving the very recent perf framework, some guesswork is involved.

However, you can set up some better monitoring. The atop package, on Debian/Ubuntu, sets up a service that logs system load and per-process activity (disk, memory, CPU). You can then consult those logs and get a feel for what was happening at the time the process crashed. Crash course: run "sudo atop -r", navigate with the "t" and "T" keys, and type "h" to get help about the various visualisations.

Also consider adding a signal handler that dumps the output of pstree to a temporary file.

Likely you are running into resource limits, for example CPU time. Try "ulimit -a" to check. If it's only a soft limit, set in a login script, then you can fix it with, e.g., "ulimit -t unlimited". If it's a hard limit, as is set for example for regular users on OpenBSD and other OSs, then you'll have to override it.

Until you nail the issue, running the script with nohup can help. If it still crashes, examine the nohup.out file.
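A quick sketch of those checks (the script name is just a placeholder):

```shell
# Soft limits for the current shell; look at the 'cpu time' line.
ulimit -a

# Hard ceiling on CPU seconds; 'unlimited' means there is no cap to hit.
ulimit -Ht

# Relaunch detached from the terminal; stdout/stderr accumulate in nohup.out.
nohup perl yourscript.pl &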
And if nothing mentioned here helps, I'd try strace/ltrace to see what system or library calls the script was making before the failure, though be warned that they generate a LOT of output.
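A sketch of the strace variant, attaching to the already-running process (the script name in the pgrep pattern is a placeholder):

```shell
# -f follows forks, -tt timestamps each call, -o diverts the flood to a file.
# Attach only if we actually find the process.
if pid=$(pgrep -f yourscript.pl); then
    strace -f -tt -o /tmp/script.trace -p "$pid"
    # After the process dies, the tail of the trace shows its final syscalls.
    tail -n 50 /tmp/script.trace
fi
```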
In a previous life I found a DEC Ultrix box that had a very clever cron job which looked for all processes with more than 1 CPU hour and killed them. Which was why the nightly batch report job died every night.
Any clever cron jobs/scripts that might be killing it? Or it might be another performance-tuning parameter, something like the ulimit answer already given.
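To sweep all the usual cron locations at once, something like this works (listing other users' crontabs requires root):

```shell
# System-wide table plus the package drop-ins.
cat /etc/crontab /etc/cron.d/* 2>/dev/null || true

# Every user's personal crontab.
for u in $(cut -d: -f1 /etc/passwd); do
    crontab -l -u "$u" 2>/dev/null || true
done
```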