I have a production Ubuntu server in a remote location, which recently started to behave strangely.
I suspect RAM errors, and want to run a physical RAM check without rebooting, Live CDs or memtests,
all of which cause downtime.
I know that an on-line RAM test is a contradiction in terms (because a full physical check requires that no other process is running), but I wonder if there is any way to run a random physical check which might give some indication of RAM failure.
Thanks,
Adam
Sounds like you're concerned with uptime. What I have done in the past is to make a VM that mirrors the system in question and runs on a separate physical host. Then run whatever diagnostics are necessary on the physical system and bring it back into service once the problem has been resolved. Just an idea for your situation if you have another system you could use.
If uptime really is a worry, there's really only one answer.
You then either know it was the old RAM (and can test it in another machine to drill it down to one machine) or you know you have more tests to do.
Leaving failing RAM in a production box will only cause you more issues down the line.
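On the original question of a spot check without downtime: a userspace pattern test is one rough option. The sketch below assumes Python 3 is available on the server; `pattern_test` and its parameters are hypothetical names, not part of any existing tool. It only exercises whatever pages the kernel happens to hand to the process, and CPU caches can mask faults, so a clean result does not prove the RAM is good. A reported mismatch, on the other hand, is a strong hint.

```python
def pattern_test(size_bytes=1 << 20, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Fill a buffer with each byte pattern and read it back.

    Returns a list of (pattern, offset) pairs where the value read
    differs from the value written. An empty list means no mismatch
    was seen in the pages this process was given.
    """
    buf = bytearray(size_bytes)  # memory comes from wherever the kernel maps it
    mismatches = []
    for pattern in patterns:
        expected = bytes([pattern]) * size_bytes
        buf[:] = expected  # write the pattern over the whole buffer
        if buf != expected:  # fast whole-buffer comparison first
            # Slow path: locate the individual bad offsets.
            mismatches.extend(
                (pattern, i) for i in range(size_bytes) if buf[i] != pattern
            )
    return mismatches


if __name__ == "__main__":
    bad = pattern_test()
    print("mismatches found:" if bad else "no mismatches seen", bad)
```

Running it repeatedly with a larger `size_bytes` covers more pages, but it can never reach memory that other processes or the kernel are using.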