We're running DB2 LUW on a RHEL box. DB2 crashed, and IBM came back and said that a file DB2 was trying to access (through open64()) had been unmounted or had become invalid.
We have done nothing but restart the database, and things seem to be running fine. Also, the file in question looks perfectly normal now:
$ cd /db/log/TEAMS/tmsinst/NODE0000/TEAMS/T0000000/
$ ls -l
total 557604
-rw------- 1 tmsinst tmsinst 570425344 Jan 14 10:24 C0000000.CAT
$ file C0000000.CAT
C0000000.CAT: data
$ lsattr C0000000.CAT
------------- C0000000.CAT
$ ls -l
total 557604
-rw------- 1 tmsinst tmsinst 570425344 Jan 14 10:24 C0000000.CAT
With those facts in hand (please correct me if I am misinterpreting the data), what could cause a file system to 'spontaneously unmount or become invalid for a short time'?
What should my next step be?
This is on Dell hardware. We ran Dell's diagnostic tools against it and they came back clean.
My guess would be an underlying hardware issue, for example a drive disconnecting from and reconnecting to the bus. Examine /var/log/messages (and run dmesg) and look for unusual SCSI or SATA messages about disconnects, etc.
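As a rough sketch of that log check, something like the following could narrow the output down. The function name scan_disk_errors and the keyword list are my own assumptions, not anything DB2 or RHEL provides, and the pattern is a starting point rather than a complete list of storage errors.

```shell
#!/bin/sh
# Hypothetical helper: keep only log lines that look storage-related.
# The keyword list is an assumption; extend it for your controller or driver.
# With no file arguments, grep reads from standard input.
scan_disk_errors() {
    grep -iE 'scsi|sata|ata[0-9]+|I/O error|remount|offline|link (down|reset)|ext[34]-fs error' "$@"
}

# Typical use (reading /var/log/messages usually requires root):
#   scan_disk_errors /var/log/messages
#   dmesg | scan_disk_errors
```

Compare the timestamps of any hits with the time of the DB2 crash. A match only points at the storage layer as a suspect; it does not prove the cause.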