We have a VMware vSphere 5 environment running CentOS 5.8 virtual machines. In the past two weeks we have had five incidents in which a virtual machine's filesystem became corrupt and required an fsck to repair.
Here is what we see in the logs:
Nov 14 14:39:28 hostname kernel: EXT3-fs error (device dm-2): htree_dirblock_to_tree: bad entry in directory #2392098: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Nov 14 14:39:28 hostname kernel: Aborting journal on device dm-2.
Nov 14 14:39:28 hostname kernel: __journal_remove_journal_head: freeing b_committed_data
Nov 14 14:39:28 hostname last message repeated 4 times
Nov 14 14:39:28 hostname kernel: ext3_abort called.
Nov 14 14:39:28 hostname kernel: EXT3-fs error (device dm-2): ext3_journal_start_sb: Detected aborted journal
Nov 14 14:39:28 hostname kernel: Remounting filesystem read-only
Nov 14 14:39:28 hostname kernel: EXT3-fs error (device dm-2): htree_dirblock_to_tree: bad entry in directory #2392099: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Nov 14 14:31:17 hostname ntpd[3041]: synchronized to 194.238.48.2, stratum 2
Nov 14 15:00:40 hostname kernel: EXT3-fs error (device dm-2): htree_dirblock_to_tree: bad entry in directory #2162743: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Nov 14 15:13:17 hostname kernel: __journal_remove_journal_head: freeing b_committed_data
The problem seems to happen while we are rsyncing application data from another server, but so far we have been unable to reproduce it on demand or identify a root cause.
After a few servers had this problem, we assumed there was an issue with the template, so we scrapped all VMs cloned from it, destroyed the template, and built a new one from scratch, installed from a freshly downloaded CentOS ISO.
We use HP EVA SANs for datastores, and moved from a 4400 to a 6300 after the first problem. Since the move and the rebuild of the virtual machines, we have seen the issue twice. On one VM we shut the server down, removed two virtual CPUs, and booted it back up again; the problem presented itself almost immediately. On the other VM we simply rebooted it, and the problem appeared about half an hour later.
Any tips or pointers in the right direction would be appreciated.
There is a KB article regarding the HP EVA, particularly relevant if you are using the Round Robin PSP. First, check vmkernel.log on the hosts for storage errors. Relevant KB entry (pdf)
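To see which PSP your EVA LUNs are currently using, and to scan the host log for storage errors, something along these lines on an ESXi 5 host is a starting point (the grep patterns are not exhaustive, and the LUNs are assumed to show up with naa.* identifiers):

    # list NMP devices and their current path selection policy
    esxcli storage nmp device list | grep -E "^naa|Path Selection Policy:"
    # look for SCSI aborts/resets around the time of the corruption
    grep -iE "abort|reset|sense data" /var/log/vmkernel.log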
To optimize EVA array performance, HP recommends changing the default round robin load balancing IOPS value to 1. This update must be performed for every Vdisk using the following command on ESX4.x:
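(Substitute the LUN's naa identifier for <device-UID>; the exact syntax for your build is in the linked KB.)

    esxcli nmp roundrobin setconfig --type "iops" --iops=1 --device <device-UID>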
For ESXi 5:
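    esxcli storage nmp psp roundrobin deviceconfig set --device=<device-UID> --iops=1 --type=iops

(Again, <device-UID> is a placeholder for the LUN's naa identifier; run it once per Vdisk as noted above.)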
If the problem can only be reproduced while you are rsyncing data from one server to another, then it is related to how data consistency looks from the kernel's point of view. If the kernel detects that the filesystem is damaged, or is about to be damaged, it will remount the filesystem read-only.
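You can check what the filesystem is configured to do when it hits an error (continue, remount read-only, or panic) with tune2fs; dm-2 below is simply the device from your logs, so use whatever /dev/mapper name it corresponds to on your system:

    tune2fs -l /dev/dm-2 | grep -i "errors behavior"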
I don't know much about the HP EVA, but does it have a battery-backed write cache? If so, can you disable the on-disk write cache and use the SAN array's write cache instead? Along with that, mount with mount -o barrier=1 and see whether you see any improvement.
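For example (the mount point /data and the mapper name below are just placeholders for whatever dm-2 actually is on your system):

    # remount the live filesystem with write barriers enabled
    mount -o remount,barrier=1 /data

    # or make it persistent in /etc/fstab
    /dev/mapper/vg0-data  /data  ext3  defaults,barrier=1  1 2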
I also have an instinct that this is somehow related to the storage rather than any filesystem fault. I am not sure how to prove it, but in most of the filesystem corruption cases I have seen, the storage was involved as a culprit somewhere, if not the main one.