smf questions - Page 1

David Mackintosh

Asked: 2011-11-27 20:00:56 +0800 CST

What part of SMF is likely broken by a hard power down?

2

At one of my customer sites, the local guy shut down their local Solaris 10 x86 server, pulled the power inputs, moved it, and now it won’t start properly. It boots and then presents a prompt which lets you log in. This appears to be the single-user milestone (or equivalent).

Digging into it, I think that SMF isn’t permitting the system to go multi-user. SMF was generating a ton of errors on autofs; after some fooling with it, I got it to generate errors on inetd and nfs/client instead. This all tells me that the problem is in some SMF state file or database that needs to be fixed, deleted, or recreated, but I don’t know what the actual issue is.

By “generate errors”, I mean that every second I get a message on the console saying “Method or service exit timed out. Killing contract <#>.” This makes interacting with the computer difficult.

Running svcs -xv shows the service as “enabled”, in state “disabled”, reason “Start method is running”. Fooling with svcadm on the service does nothing, except confirm that the service is not in a Maintenance state.

Logs in /lib/svc/log/$SERVICE just show that this loop has been happening once per second. Logs in /etc/svc/volatile/$SERVICE confirm that at boot SMF attempts to start the service and immediately stops it, with no further entries. Note that system-log isn’t starting because it depends on autofs, so I have no syslog or dmesg.

Googling all these terms only tells me how to debug or fix autofs, nfs/client, inetd, or rpc/gss individually. rpc/gss was the dependency that SMF was using as an excuse to prevent nfs/client from “starting”: it was claiming that rpc/gss was “undefined”, which is incorrect since this all used to work. I re-enabled it with inetadm, but inetd still won’t start properly. I think that the problem is SMF in general, not the individual services.

Running restore_repository with the “manifest_import” backup does nothing to improve, or even detectably change, the situation. I didn’t use a boot backup because the last boot(s) were not useful.

I have told the customer that since the valuable data directories are on a separate file system (which fsck reports as clean, so it is intact) we could just re-install Solaris 10 on the / partition. But that seems like an awfully Windows-like solution to inflict on this problem.

So. Any ideas what piece is broken and how I might fix it?

Update 1: I should probably mention that this system has two file systems, / and /export. Both fsck clean and mount properly.

700 Software

Asked: 2011-07-02 13:49:27 +0800 CST

Solaris SMF: Kill with custom signal, or get PID, or prevent kill of children

2

In the Solaris service XML, I am using :kill to signal a graceful shutdown:

<exec_method type="method" name="stop" exec=":kill" timeout_seconds="60" />

This works great, except for the fact that it also signals the child processes, which mostly just die after a SIGTERM. Any of these would work:

Get the PID so I can use exec="kill -SIGUSR1 $PID"
Prevent SIGTERM from being sent to the children. (or at least not the grandchildren)
Use some other signal

I would prefer not to set up a separate script that has to go figure out the pid. I will do this if I have to. I would prefer to get it from an environment variable, or use a SMF built in command.
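
For the "use some other signal" option, smf_method(5) documents that the :kill token takes an optional signal argument. The fragment below is a minimal sketch, assuming the main process treats SIGUSR1 as its graceful-shutdown signal (that choice of signal is an assumption carried over from the question, not something the service is known to support). The signal is still delivered to every process in the service's primary contract, so children that do not handle or ignore SIGUSR1 would still be terminated by it.

```xml
<!-- Sketch only: send SIGUSR1 instead of the default SIGTERM.
     Assumes the main process handles SIGUSR1 as "shut down gracefully"
     and that child processes ignore or handle it. -->
<exec_method type="method" name="stop" exec=":kill -USR1" timeout_seconds="60" />
```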

webjay

Asked: 2011-04-15 00:53:18 +0800 CST

Too many open files

2

In php-fpm.conf I have:

rlimit_files = 8192

My server is a 1G SmartMachine from Joyent, meaning it runs Solaris with 1 GB of memory.

My problem is that on high load I get errors like this:

Warning (2): touch() [function.touch]: Unable to create file app/tmp/cache/persistent/cake_core_users_da because Too many open files in [cake/libs/file.php, line 125]

Is my rlimit_files too low, and if so how high should I set it?

aaa90210

Asked: 2010-08-27 14:17:00 +0800 CST

Solaris SMF kill service because child dies

6

I am using SMF to manage a service under Solaris 10.

This service is itself a process manager, and forks off many child processes, some of which die occasionally (or are killed for various reasons). The service process itself, however, is very robust and never dies.

The problem I have is that when I manually kill one of these child processes using the KILL signal, SMF will restart the main service:

[ Aug 27 08:07:06 Stopping because process received fatal signal from outside the service. ]

Is there a way I can configure SMF or the service manifest such that SMF will not kill the service if one of the service sub-processes gets killed?

TIA
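
One documented knob for this situation is the startd/ignore_error property described in svc.startd(1M), which takes a comma-separated list of events (core, signal) that should not count as service failures. A minimal sketch of how it might look in the service manifest, assuming the service uses the default svc.startd restarter; where exactly it sits within this particular manifest is not known from the post:

```xml
<!-- Sketch only: tell svc.startd not to treat a core dump or an external
     fatal signal in a contract member as a failure of the whole service. -->
<property_group name="startd" type="framework">
    <propval name="ignore_error" type="astring" value="core,signal" />
</property_group>
```

With this in place the service would still be restarted if every process in its contract exits.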

Phillip B Oldham

Asked: 2010-04-01 04:24:24 +0800 CST

SMF restarting service whenever there's output?

2

I'm trying to add a custom service to SMF's configuration, which seems successful in that the service starts and there is a log file, but therein lies the problem: the service, on start-up, prints some logging messages to stderr. It seems that SMF is seeing those messages and, believing them to be errors, restarts the service, giving up after a number of tries and leaving the service off.

Here's part of the log output:

[ Mar 30 14:59:54 Enabled. ]
[ Mar 30 14:59:54 Executing start method ("java server.CustomServer"). ]
Starting server...
[ Mar 30 15:00:04 Method or service exit timed out.  Killing contract 107. ]

Running the server directly on the command line is fine, and AFAICS there are no errors being encountered during startup, other than the output.

What would be the best way to manage this service with SMF? The logging is needed for diagnosing problems, and would be problematic to disable. Is it possible to configure this service to only restart if the service exits?
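
The log excerpt shows the "timed out" message ten seconds after the start method began, which is consistent with the start method never returning rather than with SMF reacting to stderr output. That reading is an assumption, since the manifest is not shown. Under it, one option documented in svc.startd(1M) is to declare the service as a "child" (wait-model) service, so that the foreground java process is itself treated as the service. A minimal sketch:

```xml
<!-- Sketch only: with duration=child, svc.startd treats the start method's
     process as the service itself and restarts the service when it exits. -->
<property_group name="startd" type="framework">
    <propval name="duration" type="astring" value="child" />
</property_group>
```

The alternative under the default contract model would be a start method that launches the server in the background and then returns.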
