Question

Gray Race

Asked: 2012-01-21 10:24:55 +0800 CST

Does anyone know how to fix OMSA on Red Hat 5.1 when it reports "No controllers found"?

I have a 64-bit Red Hat 5.1 server (a Dell 2950) with a PERC 5/i controller that was working fine until recently.

On it I have an NRPE command, check_openmanage, that started returning errors:

/usr/local/nagios/libexec/check_openmanage
Storage Error! No controllers found
Problem running 'omreport chassis memory': Error: Memory object not found
Problem running 'omreport chassis fans': Error! No fan probes found on this system.
Problem running 'omreport chassis temps': Error! No temperature probes found on this system.
Problem running 'omreport chassis volts': Error! No voltage probes found on this system.

Obviously these components exist, as the system is up and running. I can access the web interface for Dell OpenManage, and it reports everything as green.

check_openmanage uses the omreport tool, which produces the same error when run directly:

[root@lynx tmp]# omreport storage controller
No controllers found

I've found a number of threads online about issues with OMSA on 64-bit RHEL 5 and CentOS 5; they suggest running the 32-bit software on 64-bit systems.

However, I'm already running the 32-bit software:

Installed Packages
Name   : srvadmin-storage
Arch   : i386
Version: 6.5.0
Release: 1.201.2.el5
Size   : 8.4 M
Repo   : installed
Summary: Storage Management accessors package, 3.5.0

Moreover, most of these posts seem to relate to the PERC 4, and mine is a PERC 5. This check was stable until recently, and the server has carried production load for a number of months, which makes me hesitant to take these steps. However, I have not found any good indication of why this behavior changed.

Has anyone experienced this issue with PERC 5?

Does anyone have further thoughts on diagnosis steps or solutions?

5 Answers

asciiphil · Answer 1 · 2013-09-17T07:09:20+08:00

I assume you've done the basic troubleshooting steps of restarting OMSA (service dataeng restart) and making sure IPMI is loaded:

service dataeng stop
service dsm_sa_ipmi start
service dataeng start

One common non-obvious cause of this problem is system semaphore exhaustion. Check your system logs; if you see something like this:

Server Administrator (Shared Library): Data Engine EventID: 0  A semaphore set has to be created but the system limit for the maximum number of semaphore sets has been exceeded

then you're running out of semaphores.
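To check a log for that message, a minimal sketch follows. The function name is hypothetical, and the usage line assumes syslog writes to /var/log/messages, as it does by default on RHEL 5.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) if the log text on stdin
# contains the OMSA semaphore-exhaustion message quoted above.
has_sem_exhaustion() {
    grep -q 'maximum number of semaphore sets has been exceeded'
}

# Usage on a live system (log path is an assumption):
#   has_sem_exhaustion < /var/log/messages && echo "semaphore set limit was hit"
```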

You can run ipcs -s to list all of the semaphores currently allocated on your system and then use ipcrm -s <id> to remove a semaphore (if you're reasonably sure it's no longer needed). You might also want to track down the program that created them (using information from ipcs -s -i <id>) to make sure it's not leaking semaphores. In my experience, though, most leaks come from programs that were interrupted (by segfaults or similar) before they could run their cleanup code.
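To see which account holds most of the semaphores before removing any, something like the following sketch may help. The function name is hypothetical, and it assumes the usual ipcs -s column layout (key, semid, owner, perms, nsems) with data rows that start with a 0x key.

```shell
#!/bin/sh
# Hypothetical helper: reads `ipcs -s` output on stdin and prints
# "<count> <owner>" for each owner, largest count first.
sems_by_owner() {
    awk '$1 ~ /^0x/ { count[$3]++ }
         END { for (o in count) print count[o], o }' | sort -rn
}

# Usage on a live system:
#   ipcs -s | sems_by_owner
```

An owner with an unexpectedly large count is a reasonable place to start looking for a leak.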

If your system really needs all of the semaphores currently allocated, you can raise the limit. Run sysctl -a | grep kernel.sem to see the current settings. The line holds four numbers; the final one (SEMMNI) is the maximum number of semaphore sets allowed on the system (normally 128), which is the limit named in the log message above. Copy that line into /etc/sysctl.conf, change the final number to a larger value, save it, and run sysctl -p to load the new settings.
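To judge how close the system is to that limit, the two numbers can be compared directly. This is a rough sketch: the function names are hypothetical, and it assumes ipcs -s data rows start with a 0x key.

```shell
#!/bin/sh
# Hypothetical helpers; both read text on stdin so they can be tried on saved output.

# Print the last field of a kernel.sem line (the semaphore-set limit).
sem_set_limit() {
    awk 'NF { print $NF; exit }'
}

# Count the semaphore sets listed in `ipcs -s` output.
sem_set_count() {
    awk '$1 ~ /^0x/ { n++ } END { print n + 0 }'
}

# Usage on a live system:
#   limit=$(sysctl kernel.sem | sem_set_limit)
#   used=$(ipcs -s | sem_set_count)
#   echo "semaphore sets: $used of $limit in use"
```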

user214274 · Answer 2 · 2015-06-02T09:40:53+08:00

Following asciiphil's instructions worked for me. In my case nrpe had a lot of semaphores open related to OpenManage. I cleaned them out and restarted everything.

This failed:

omreport chassis memory
Memory Information

Error : Memory object not found

Make sure there are enough semaphores:

sysctl -a | grep kernel.sem
ipcs -s |wc -l

Stop nrpe which uses omreport:

/etc/init.d/nrpe stop

Remove the nrpe semaphores, then restart the OMSA services:

ipcs -s | awk '/nrpe/ {print "ipcrm -s", $2}' | sh
/etc/init.d/dataeng stop
/etc/init.d/dsm_sa_ipmi stop
/etc/init.d/dsm_sa_ipmi start
/etc/init.d/dataeng start

Make sure it started cleanly:

tail -n 50 /var/log/messages

Test:

omreport chassis memory

Restart nrpe:

/etc/init.d/nrpe restart

tripleee · Answer 3 · 2016-12-21T00:34:00+08:00

I ran into this on a host where a Nagios job was scheduled to check OpenManage. It manifested as a large number of stale semaphores owned by Nagios.

I put in a nightly cron job to find the stale ones by simply taking two listings 10 minutes apart; anything present in both listings is assumed to be stale. (Adjust for your circumstances, obviously.)

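nagioi () {
    # List only semaphore arrays; ipcs -a would also list shared memory
    # segments and message queues, whose IDs must not be passed to ipcrm -s
    ipcs -s | awk '$3 == "nagios" { print $2 }'
}

# Run two listings, 10 minutes apart
# Anything present in both listings is assumed to be stale
(nagioi; sleep 600; nagioi) |
sort | uniq -d |
xargs -n 1 -r -t ipcrm -s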
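The "present in both listings" step can be tried on its own before anything is removed. The sketch below uses a hypothetical function that takes two snapshot files of semaphore IDs, one ID per line with no repeats inside a file, and prints the IDs common to both.

```shell
#!/bin/sh
# Hypothetical helper: print the IDs that appear in both snapshot files.
# Because each file lists an ID at most once, any line that sort | uniq -d
# reports as duplicated must have come from both files.
ids_in_both() {
    sort "$1" "$2" | uniq -d
}

# Usage: save two listings some minutes apart, then review the result
# by hand before passing any ID to ipcrm -s:
#   ids_in_both first_listing.txt second_listing.txt
```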

Cristiano Ness · Answer 4 · 2018-04-28T03:47:21+08:00

For this failure:

omreport chassis memory
Memory Information
Error : Memory object not found

Stop srvadmin-services.sh:

srvadmin-services.sh stop

The following command can be used to clear semaphores with the last-op parameter "Not set":

for i in $(ipcs -s -t | grep "Not set" | cut -d ' ' -f1); do ipcrm -s "$i" && echo "$i cleared."; done
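It may be worth previewing which semaphores match before deleting them, since "Not set" only means no operation has been performed on the set since it was created, and a set created moments ago by a running process could match too. A minimal sketch, with a hypothetical function name and assuming the ipcs -s -t column order of semid, owner, last-op, last-changed:

```shell
#!/bin/sh
# Hypothetical helper: reads `ipcs -s -t` output on stdin and prints
# "<semid> <owner>" for every row containing "Not set".
never_used_sems() {
    awk '/Not set/ { print $1, $2 }'
}

# Usage on a live system (lists candidates only, removes nothing):
#   ipcs -s -t | never_used_sems
```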

Start srvadmin-services.sh:

srvadmin-services.sh start

Marcelo · Answer 5 · 2013-07-26T11:47:29+08:00

Try /etc/init.d/dataeng start and /etc/init.d/dsm_om_shrsvc start
