I have a simple two-node cluster that has been working fine for the last couple of weeks. I haven't made any changes to the nodes, but a few days ago the metrics data stopped showing up in OpsCenter. By all indications everything else is working fine: OpsCenter can still tell whether each node is up or down, and no errors are reported in the GUI.
I've seen a couple of other posts about this, but their solutions don't apply to my scenario. The server is not under heavy load, I have fewer than 10 column families since this is just for testing, and there is no Thrift password configured.
When I look in the opscenterd.log I see the following:
2015-06-09 00:16:40+0000 [] ERROR: Error fetching metric data: Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/opscenterd/MetricFetcher.py", line 470, in _fetch_through_cache
UnavailableException: UnavailableException()
2015-06-09 00:16:40+0000 [] ERROR: Problem while calling NewMetricsController (IndexError): list index out of range
  File "/usr/share/opscenter/lib/py-debian/2.7/amd64/twisted/internet/defer.py", line 1020, in _inlineCallbacks
    result = g.send(result)
  File "/usr/lib/python2.7/dist-packages/opscenterd/MetricFetcher.py", line 612, in fetchMetrics
And in agent.log I see this:
ERROR [os-metrics-5] 2015-06-09 17:47:41,161 Long os-stats collector failed: Cannot run program "iostat": error=2, No such file or directory
ERROR [os-metrics-4] 2015-06-09 17:47:41,162 Long os-stats collector failed: Cannot run program "iostat": error=2, No such file or directory
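The agent.log errors say the agent cannot find an `iostat` executable. A rough way to confirm that from the node itself is the sketch below. It assumes Python 3.3+ is available (for `shutil.which`), and the helper name `find_missing_tools` is only something made up for illustration, not part of OpsCenter. It should be run as the same user the agent runs as, since the PATH may differ between users.

```python
import shutil


def find_missing_tools(tools):
    """Return the names in `tools` that cannot be found on the current PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # "iostat" is the program named in the agent.log error above.
    missing = find_missing_tools(["iostat"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH")
```

On Debian and Ubuntu, `iostat` normally comes from the `sysstat` package, so a missing binary may just mean that package is not installed. Whether that is related to the `UnavailableException` in opscenterd.log is not clear from these logs.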
Any ideas on how to resolve this?