- Client : OpenSSH_5.1p1 Debian-5ubuntu1 (Ubuntu 9.04)
- Server : OpenSSH_5.1p1 Debian-5 (Proxmox 2.6.24-7-pve)
I use SSH to execute commands remotely on the server (via the Nagios check_by_ssh plugin). But SSH hangs from time to time when trying to execute commands. I can log in to the server via SSH, but I cannot execute a simple 'ls'. The problem seems to affect all clients coming from the same IP address. Authentication is not the problem, whether it is done with SSH keys or with a password.
ssh -l root -p 2222 server.domain.tld 'ls'
Here is the client debug output:
debug1: Entering interactive session.
debug2: callback start
debug2: client_session2_setup: id 0
debug1: Sending environment.
debug3: Ignored env ORBIT_SOCKETDIR
*** skipping approx 40 env var ignored
debug1: Sending command: ls
debug2: channel 0: request exec confirm 1
It hangs there. Then, after a random amount of time, it works again (without my doing anything). Killing all sshd processes on the server seems to work too. It works from PuTTY. I saw that some people had trouble like this due to an ISP reverse DNS problem, but that does not seem to be the case here.
It can work for hours and then not work for half an hour or so.
What could explain this behaviour?
EDIT: It seems that with the -t or -T option ssh does not hang, but I can't pass either of these options to Nagios's check_by_ssh.
I had the same problem, and today finally discovered what was causing the issue (for me at least). This might help you too.
When ssh is setting up a session, the DSCP flags field in the IP header is set to 0x0. If you establish an interactive session, it is set to 0x10 (16), and if you establish a non-interactive session, it is set to 0x8 (8). The ssh client sets the DSCP field with the setsockopt() system call (which I verified in the source).
A faulty configuration on a VPN at my employer was dropping packets with a DSCP of 0x8, causing all non-interactive ssh traffic to get dropped as well. To verify that it was the DSCP field that was causing the drop, I used iptables on the ssh server to force the DSCP field to 0x10 and tested my non-interactive traffic (ssh ls, the same thing you were trying), and it worked after that. You might try the same thing and see if that's why your session is hanging.
To set DSCP to 0x10 on all outgoing ssh traffic from your ssh server, run:
$ sudo iptables -t mangle -A OUTPUT -p tcp --sport 22 -j DSCP --set-dscp 0x10
This was on a RHEL 6.5 box.
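A note on the numbers, as a rough sketch: the values ssh passes to setsockopt() (0x10 and 0x08) are values of the whole 8-bit TOS byte, while the iptables DSCP target takes the 6-bit DSCP value, which is the TOS byte shifted right by two bits. The helper names below are made up for illustration; they only show the arithmetic, which may help when comparing the "tos" value printed by tcpdump -v with a DSCP rule.

```shell
# Hypothetical helpers (names are illustrative, not part of any tool).
# DSCP is the upper 6 bits of the 8-bit TOS / Traffic Class byte.

tos_to_dscp() {
    # Drop the two low-order (ECN) bits of the TOS byte.
    echo $(( $1 >> 2 ))
}

dscp_to_tos() {
    # Shift the 6-bit DSCP value back into its position in the TOS byte.
    printf '0x%02x\n' $(( $1 << 2 ))
}

tos_to_dscp 0x10   # TOS byte of an interactive ssh session
tos_to_dscp 0x08   # TOS byte of a non-interactive ssh session
dscp_to_tos 0x10   # TOS byte produced by "--set-dscp 0x10"
```

In other words, a rule with --set-dscp 0x10 writes 0x40 into the TOS byte rather than 0x10. For this test that makes no difference, since the packets no longer carry the 0x08 that was being dropped. If you want to set the TOS byte itself, iptables also has a TOS target (-j TOS --set-tos 0x10).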
I got the idea for resolving my problem from this blog. I also had a very interesting problem.
I have an L2VPN link (vendor-provided MPLS L2) connecting my head office (HO) and a branch office. All ping connectivity tests worked fine. When I used SSH from a Debian server at HO to a Debian server at the branch (client side), I could log in, but once logged in I was unable to run ifconfig, htop or ps -ef: as soon as I ran one of those commands, the session froze. Even when I tried from a Windows PC using PuTTY, the result was the same. Interestingly, when I connected through the PuTTY manager application from a Windows 7 PC, it worked fine. After reading this blog, I got the MPLS MTU information from the service provider and tried the same scenario with different MTU sizes on the source Debian server's interface at HO. In the end, MTU sizes from 1440 to 1470 worked fine, whereas the default MTU size of 1500 did not. Conclusion: both Debian servers had the default MTU of 1500, but the MTU of the service provider's MPLS L2VPN in between did not match. Thanks.
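One way to check for this kind of mismatch is to ping with the "don't fragment" bit set and a payload sized to fill the MTU exactly. The sketch below assumes Linux iputils ping (its -M do option) and plain IPv4 without IP options; the function names are made up, and server.domain.tld is just the placeholder host from the question.

```shell
# Hypothetical helper: largest ICMP echo payload that fits in one IPv4 packet
# of the given MTU (20-byte IPv4 header + 8-byte ICMP header, no IP options).
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Print (without running it) the ping command that tests a given MTU.
# $1 = host, $2 = MTU to test
df_ping_command() {
    echo "ping -M do -c 3 -s $(icmp_payload_for_mtu "$2") $1"
}

df_ping_command server.domain.tld 1500
df_ping_command server.domain.tld 1470
```

If the command printed for 1500 gets no reply or an error while the one for 1470 is answered, some hop on the path cannot carry full 1500-byte packets even though both end hosts are configured for them.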
You may be hitting an SSH rate limiter on the server-side network. This is a firewall technique that blocks IP addresses which make too many new connection requests within a short period of time; the source IP is then blocked for a defined period.
Possibly an ICMP Path MTU Discovery issue.
In our case, all ICMP message types were blocked by the firewall on the server side. Decreasing the client-side MTU (as advised by this text) solved the issue temporarily. But after allowing all ICMP types (except redirect) on the server side, the problem was gone.
I've experienced the same thing when having MTU problems, using Cisco's IPsec client-to-site VPN with OpenVPN on top of that. Basically, any packet with a size of 1500 bytes would freeze the session.
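With stacked tunnels like this the usable MTU is hard to predict, so it can be easier to measure it. Below is a minimal sketch of a binary search for the largest size that gets through. Everything in it is hypothetical: find_path_mtu is an illustrative name, and the probe is a stand-in that you would replace with a real test, for example a wrapper around a don't-fragment ping, assuming echo requests are answered on your path.

```shell
# Hypothetical sketch: binary search for the largest MTU that a probe accepts.
# The probe is any command or function that takes an MTU and returns 0 when a
# packet of that size gets through.
# $1 = probe, $2 = lower bound known to work, $3 = upper bound to try
find_path_mtu() {
    probe=$1 lo=$2 hi=$3
    while [ "$lo" -lt "$hi" ]; do
        # Round up so the loop always makes progress.
        mid=$(( (lo + hi + 1) / 2 ))
        if "$probe" "$mid"; then
            lo=$mid             # mid works: the answer is mid or larger
        else
            hi=$(( mid - 1 ))   # mid fails: the answer is smaller
        fi
    done
    echo "$lo"
}

# Stand-in probe for demonstration only: pretends the path carries at most
# 1470 bytes. Replace it with a real test against your own host.
fake_probe() { [ "$1" -le 1470 ]; }

find_path_mtu fake_probe 576 1500
```

The search assumes that the lower bound really works and that every size below the limit passes, which is normally the case for an MTU problem but not for a filter that drops packets for other reasons.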
On my Linux machine, the MTU of the VMware network adapter vmnet2 was set to 1500. The network is used for a virtual machine acting as an internet gateway. After I changed the uplink from Wi-Fi internet to PPPoE internet (fiber optics), remote ssh stopped working. After lowering vmnet2's MTU to 1400, remote ssh commands to other ssh servers worked again.
Check sshd on the server side. You can "strace" the created process / main sshd process and see which syscalls it is making. This should give you more information about what it is doing.
Also try "touch /tmp/randomfile" and see whether the hang happens before the file gets created or afterwards.
Have you checked to make sure there aren't any PAM errors? Just because it works from PuTTY doesn't mean authentication isn't the problem.
I had a similar problem. Both the client's and the server's MTU were 9000. After I lowered the client's MTU to 1500, the problem was gone.
A hang on "Sending command" can be caused by SSH actually waiting for the passphrase on the key. You can find out whether this is the case by leaving off the command and SSH'ing into the server without a command at the end; it would then ask for a passphrase.