I'm trying to diagnose random hardware freezes and did the natural thing: started memtesting everything.
DDR400 DDR2 RAM, CAS 5-5-5-18. Four sticks in two sets: 2x2GB PNYs and 2x1GB Elixirs.
So far I've tested one of the Elixirs and one of the PNYs, and both fail at the same point (around 129MB) and only on the Random Number Sequence test.
Am I missing something? Is this a normal failure? Is it a case of getting new RAM or throwing the motherboard out?
I'm trying to get my hands on another DDR2 machine to do comparative tests but thought I'd ask anyway.
On some of the servers I managed, we started to get many ECC errors in the kernel logs. Even after the whole RAM set was changed, the errors continued as if we had done nothing to the server. At that point we concluded that the memory controllers were fried and swapped out the motherboard.
This might well be the case for you.
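For anyone who wants to watch ECC error counts the same way, here is a rough sketch. It assumes a Linux machine with ECC RAM and the EDAC driver loaded, which exposes per-controller counters under /sys/devices/system/edac/mc; a desktop board without ECC will simply show nothing. The function name read_edac_counts is just made up for this example.

```python
from pathlib import Path

def read_edac_counts(base="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} from the EDAC sysfs tree.

    Assumption: the kernel EDAC driver is loaded and the board has ECC RAM.
    If the directory is missing, an empty dict is returned.
    """
    counts = {}
    root = Path(base)
    if not root.is_dir():
        return counts  # no EDAC support visible on this machine
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())  # corrected errors
            ue = int((mc / "ue_count").read_text().strip())  # uncorrected errors
        except (OSError, ValueError):
            continue  # skip controllers whose counters cannot be read
        counts[mc.name] = (ce, ue)
    return counts

if __name__ == "__main__":
    for name, (ce, ue) in read_edac_counts().items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

If the counters keep climbing after every module has been replaced, that points away from the RAM and toward the memory controller or the board, which is what happened to us.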
I would usually assume a timing problem (have you tried more conservative values?) or genuinely marginal hardware: dodgy cache memory (which might be on the mainboard or the CPU), a dodgy memory controller/CPU, or RAM modules that electrically overload or mismatch the memory bus. If there is any gunk (stickers, etc.) on the mainboard PCB, try removing it. (To very technical people: I know there is usually insulating solder mask over the traces, but a lossy object nearby could be just enough to get a microstrip out of whack!)
Edit: NOT the stickers the manufacturer put on. However, some assembly houses put their own serial/stock number stickers on components; I'm surprised they do not cause problems more often that way.