We run a simple deployment script remotely using a command like ssh [email protected] sudo /root/run-chef-client.sh. It started to hang today because sshd waited forever on 10.170.4.11 even after sudo had already finished. We started sshd in debug mode and got two different kinds of logs. The following is a normal log, when the session does not hang:
debug1: Received SIGCHLD.
debug1: session_by_pid: pid 23187
debug1: session_exit_message: session 0 channel 0 pid 23187
debug1: session_exit_message: release channel 0
Received disconnect from 10.170.4.6: 11: disconnected by user
And when it hangs we get the following:
debug1: Received SIGCHLD.
debug1: session_by_pid: pid 24209
debug1: session_exit_message: session 0 channel 0 pid 24209
debug1: session_exit_message: release channel 0
Our understanding is that the server process waits for some communication from the client side and never gets it. It's hard to tell whether this is a client-side or a server-side problem.
We tried to run sshd under strace but did not succeed, because the SUID bit on sudo was ignored in this case. So, what else should we try in order to debug or prevent these situations?
Using ssh -t (forced PTY allocation) on the client side solved the problem: sshd is now controlled by a pseudo TTY, not by the client anymore.
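
One possible explanation, offered as an assumption and not a confirmed diagnosis of this case: without a PTY, the command's output reaches sshd through a pipe-like channel, and the reader of a pipe sees end-of-file only once every process holding the write end has closed it. A background process left behind by the script could therefore keep the session open after sudo has exited. The sketch below reproduces that effect locally with plain sh and no ssh involved; sleep 2 stands in for a hypothetical lingering process.

```shell
# Local illustration only: 'sleep 2' stands in for a hypothetical background
# process left behind by the deployment script.

start=$(date +%s)
# The inner shell exits right after echo, but the backgrounded sleep inherits
# the pipe used by $(...), so the substitution waits for EOF (about 2 seconds).
held=$(sh -c 'sleep 2 & echo finished')
mid=$(date +%s)

# Redirecting the child's output away from the pipe lets the reader see EOF
# as soon as the inner shell exits.
released=$(sh -c 'sleep 2 >/dev/null 2>&1 & echo finished')
end=$(date +%s)

held_secs=$((mid - start))
released_secs=$((end - mid))
echo "inherited stdout: ${held_secs}s, redirected: ${released_secs}s"
```

If this is what happens on the server, redirecting the output of anything the script starts in the background may be an alternative to forcing a PTY.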