I have spam in dmesg containing:
kernel:EDAC MC0: UE page 0x0, offset 0x0, grain 1073741824, row 3, labels ":": i3200 UE
Any idea what suppose to be wrong?
here are loaded modules:
# lsmod | grep edac
i3200_edac 3330 0
edac_core 46581 2 i3200_edac
edac-util does not show any errors
# edac-util -v
mc0: 0 Uncorrected Errors with no DIMM info
mc0: 0 Corrected Errors with no DIMM info
mc0: csrow0: 0 Uncorrected Errors
mc0: csrow0: ch0: 0 Corrected Errors
mc0: csrow0: ch1: 0 Corrected Errors
mc0: csrow1: 0 Uncorrected Errors
mc0: csrow1: ch0: 0 Corrected Errors
mc0: csrow1: ch1: 0 Corrected Errors
mc0: csrow2: 0 Uncorrected Errors
mc0: csrow2: ch0: 0 Corrected Errors
mc0: csrow2: ch1: 0 Corrected Errors
mc0: csrow3: 0 Uncorrected Errors
mc0: csrow3: ch0: 0 Corrected Errors
mc0: csrow3: ch1: 0 Corrected Errors
mc0: csrow4: 0 Uncorrected Errors
mc0: csrow4: ch0: 0 Corrected Errors
mc0: csrow4: ch1: 0 Corrected Errors
mc0: csrow5: 0 Uncorrected Errors
mc0: csrow5: ch0: 0 Corrected Errors
mc0: csrow5: ch1: 0 Corrected Errors
mc0: csrow6: 0 Uncorrected Errors
mc0: csrow6: ch0: 0 Corrected Errors
mc0: csrow6: ch1: 0 Corrected Errors
mc0: csrow7: 0 Uncorrected Errors
mc0: csrow7: ch0: 0 Corrected Errors
mc0: csrow7: ch1: 0 Corrected Errors
this seems to be a memory error, but not a fatal one.
will prevent the spam on the console until the next reboot.
Basically the ce_errors are short for correctable errors (aka no "defect" outside the ram).
see kernel docu about edac and edac wiki for more details.
While i may be totally wrong we have a server with this (ECC RAM) and as there are no uncorrectable errors and memdisk did not show any issues either i let it run with the same ram, change the outputting, started monitoring the uncorrectable errors and there was no further issue for us.