I have scripts that run programs (that I did not author) and some are getting segfaults on a few inputs. I do these things in large batches that run for up to a week, and would like to know which input(s) are triggering the problem. As it stands, I get a notification from Bash that a certain script of mine has gotten a seg fault. But the problem is not in the script, it's in the 3rd party program and its input. If I had the name of the input, I could work on the problem.
The current form of the call in my Bash script is (for example program 'autofix')
for indata in base*.fix; do
autofix $indata >${indata/.fix/.stdout} &
done
As you can see, they are started in the background, and on my workhorse server, there may be as many as 100 or so of these started at once, so I cannot tell which one has failed, and I'm not patient enough to try all 100 one at a time, since each may take an hour. Capturing stderr does not capture anything so I'm looking for other ideas.
I found one in-line solution based just by playing with things I already knew (or was at least used to bumping into) in Bash. To test and to illustrate, I'm using a tiny C program I wrote that just causes a SEGFAULT:
Then after some trials (see comments below) I wound up with this simple pattern in Bash (which could be a one-liner but I'm presenting a more readable form):
This does not die because of the SEGFAULT, but just notices it. Therefore the output is "this died" and not "strange success" or a total process fault.
This approach shows the error code that distinguishes between SEGFAULT and other forms of process failure, in case that is desired.