I've got an ESXi 3.5 server on "unsupported" hardware - a SuperMicro motherboard and an Adaptec 3405 RAID controller (the onboard 9410 controller is not being used for RAID) - and I'm starting to wonder what the point of RAID is, because we don't have any monitoring on it.
Is it possible to have monitoring of RAID on ESXi apart from 100% compatible systems / using a paid product like vSphere, or should we switch to an "endorsed" hardware system or perhaps a SAN?
Update: I've found this Adaptec knowledge base article which says that ESXi has no monitoring support:
There is a AACRAID driver embedded in ESXi Server 3.5 (see VMware Certified Compatibility Guides) but there is no Management Software (ASM or ARCCONF) available.
The card does however have diagnostic LEDs with headers, so I suppose some sort of hardware hack might be a last resort.
Of course ESXi has thorough and extensive hardware monitoring support (that article talks about only one Adaptec flavour of monitoring). I can tell you everything that's going on in every part of my fully supported hardware. If Adaptec make an ESX/ESXi driver for your adapter with hardware RAID support, then it'll pass pre-failure and failure warnings up to ESX/ESXi, which can in turn forward them via either vCenter or SNMP.
I am not sure about ESXi monitoring the RAID, but does your Supermicro have a management card? Those generally have SNMP, and you can easily monitor the RAID, fans, etc.
Going with fully supported hardware plus a SAN would be ideal in a production environment, if you need it.
Most of our virtual stuff is on Dell R710s with Dell EqualLogic PS6000Es for storage - the combo works great. Plenty of monitoring and excellent performance/RoI etc.
However, arcconf can be easily patched to be used with ESXi 4.1 (change device filename generation and file locking check).
This information may be out of date, but what we found a year or two ago when we set up ESXi was that only certain cards supported the RAID monitoring through ESXi. We had initially selected an Adaptec controller, but just couldn't get it working. We switched to the LSI 8708ELP and have access to the RAID array information. I don't remember the specifics, but there was a specific protocol that pretty much only the LSI supported. We are also using Supermicro hosts and have the "+" model management cards, but they don't offer access to the array information.
We use Nagios with the check_esx_wbem.py check command to monitor the array. It works very well, and has definitely detected degraded arrays.
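For anyone curious what a check like this boils down to, here is a minimal sketch of the status-mapping step only. It is not the source of check_esx_wbem.py; it assumes the health data has already been fetched over WBEM/CIM as (name, HealthState) pairs, where HealthState follows the DMTF CIM values (5 = OK, 10 = degraded, 15 = minor failure, 20 and above = major failure or worse). The function names and the choice to treat a minor failure as a WARNING are my own assumptions.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

# Ordering used to pick the worst result: CRITICAL > WARNING > UNKNOWN > OK.
SEVERITY = {OK: 0, UNKNOWN: 1, WARNING: 2, CRITICAL: 3}


def health_to_nagios(health_state):
    """Map a CIM HealthState integer to a Nagios exit code (assumed mapping)."""
    if health_state == 5:                 # OK
        return OK
    if health_state in (10, 15):          # degraded / minor failure
        return WARNING
    if health_state in (20, 25, 30):      # major, critical, non-recoverable
        return CRITICAL
    return UNKNOWN                        # 0 (unknown) or anything unexpected


def summarize(elements):
    """Reduce (name, health_state) pairs to one (exit_code, message) result."""
    if not elements:
        return UNKNOWN, "UNKNOWN - no elements reported"
    results = [(name, health_to_nagios(state)) for name, state in elements]
    worst = max((code for _, code in results), key=SEVERITY.get)
    unhealthy = [name for name, code in results if code != OK]
    if unhealthy:
        detail = ", ".join(unhealthy)
    else:
        detail = "all %d elements healthy" % len(results)
    return worst, "%s - %s" % (LABELS[worst], detail)
```

A real plugin would print the message and exit with the code, e.g. `code, msg = summarize(pairs); print(msg); sys.exit(code)`, so that Nagios picks up both. The element names you get back depend entirely on what the controller's CIM provider exposes.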
Another option would be to use a controller that has out-of-band monitoring. For example, I don't know if they're supported, but the higher-end Areca cards have an Ethernet port on them.
Or use an external SAN, such as the ~$3,500 Drobo Elite or a $30,000+ EqualLogic. These are stand-alone devices and, at least in the EqualLogic case, can definitely be monitored via Nagios completely separately from VMware. I haven't used the Drobo but imagine it has some monitoring available as well.