I'm a Linux guy trying to save a Windows guy from getting on a plane to check on a server that "appears to not be running".
The setup
- Windows server on a remote network with IP 10.55.0.56 (you don't know anything else about it)
- You can get** TCP port forwarding from a Windows workstation in your office to any port on 10.55.0.56
- You do not have access to any Linux/Unix machines on the remote network.
** There is a piece of commercial hardware on the remote network that can be used for port forwarding. It has a clunky web interface with several steps required to add each port. Doing a full port scan is unfeasible.
The question
- How do you check to see if it is alive?
The surest way to verify that any server is running is to sit down in front of it and wiggle the mouse. But the good news here is that anybody can do that. I'm not sure where you're keeping this server, but there must be somebody who can physically check it before getting on a plane. If you're an MSP and it's a customer's server, your customer should have physical access to the server. If it's at a colocation facility, then your service agreement should specify that somebody is there with physical access to the server.
But barring that, let's start with the basics.
Can you ping the server? A ping from outside the remote network to the public IP address will tell you if the internet is up and the remote firewall is working. A ping from inside the remote network will tell you if the intranet is up and if the server is responding. If you can't ping it, then the server or the network is down, or ICMP is being blocked somewhere along the way. (Some administrators block ICMP, and some versions of Windows ship with ICMP blocked by the Windows firewall, which means ping will always fail. While usually done in the interest of security, it's a silly decision, IMO. That's like trying to change your tires without a lug wrench.)
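If you want to script the ping test rather than run it by hand, here is a minimal Python sketch. It only wraps the operating system's own ping command, so it has to run on a machine that can actually route to the address (a workstation on the remote network, for example). The function names and the timeout values are my own choices for illustration, and the non-Windows flags assume Linux's ping.

```python
import platform
import subprocess


def build_ping_command(host, count=3, timeout_s=2, system=None):
    """Return the ping argument list for the current (or given) OS."""
    system = system or platform.system()
    if system == "Windows":
        # Windows ping: -n is the echo count, -w the per-reply timeout in ms.
        return ["ping", "-n", str(count), "-w", str(timeout_s * 1000), host]
    # Linux ping (assumed): -c is the echo count, -W the timeout in seconds.
    return ["ping", "-c", str(count), "-W", str(timeout_s), host]


def host_answers_ping(host, count=3, timeout_s=2):
    """True if ping exits with status 0; False means no usable reply was seen."""
    cmd = build_ping_command(host, count, timeout_s)
    try:
        result = subprocess.run(
            cmd, capture_output=True, timeout=count * (timeout_s + 1) + 5
        )
    except (OSError, subprocess.TimeoutExpired):
        # ping binary missing, or it hung far longer than expected.
        return False
    return result.returncode == 0
```

Calling host_answers_ping("10.55.0.56") returns True or False. A False result does not prove the server is down, for the ICMP-blocking reasons above, and on Windows it is worth reading the ping output as well, since I have seen "Destination host unreachable" replies still produce a zero exit status.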
Can you connect to the specific ports used by your application? In most cases, forwarded ports are forwarded ports, so unless you're doing layer 7 (application-layer) packet inspection and filtering to allow only specific protocols, you can remotely test if there is a service listening on a specific port. You can use a telnet session to those ports (e.g. telnet localhost 25 to test SMTP). You can simplify port tests with an app such as CryPing that does a telnet-like test in a ping-like format.
If your ping tests fail, you can sometimes use Trace Route to identify the point of failure. Trace Route essentially sends a "ping" to each router along the path from your workstation to the destination IP. This isn't always accurate, however, because a lot of ISPs will block ICMP and your trace will stall. As mentioned earlier, if ICMP is blocked by the destination server's software firewall, ping will also fail.
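The telnet-style port test described above can be sketched in a few lines of Python. This is only an illustration: the function name and timeouts are mine, and the host and port you pass in are whatever your port forward exposes. It also tries to read a greeting banner, because depending on how the forwarding device works, a successful connect may only prove that the forwarder accepted the connection, while a banner (such as SMTP's "220 ...") has to come from the server itself.

```python
import socket


def tcp_port_open(host, port, timeout_s=5.0):
    """Return (True, banner) if a TCP connection succeeds, else (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            # Wait briefly for a greeting; many services (SMTP, FTP, SSH) send one.
            sock.settimeout(2.0)
            try:
                banner = sock.recv(256).decode("ascii", errors="replace").strip()
            except socket.timeout:
                banner = ""  # connected, but the service did not speak first
            return True, banner
    except OSError as exc:
        # Refused, unreachable, timed out, or reset while waiting for a banner.
        return False, str(exc)
```

For the telnet example above, tcp_port_open("localhost", 25) would be the equivalent call. An empty banner is normal for protocols where the client speaks first, such as HTTP or RDP.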
If your ping tests succeed, then you need to get access to the server. Note that sometimes a server will respond to ping even if the shell has crashed and is unresponsive.
Can you open a remote shell to the remote host? For a Windows server, if port 3389 is forwarded and Remote Desktop is enabled, you can connect with a Remote Desktop client. If you have access to a local Windows workstation on the remote network, you can try Remote Desktop from that workstation, or try remote PowerShell or PsExec from Sysinternals.
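If you only want to know whether Remote Desktop is answering on a forwarded port, without logging in, you can send the first packet of the RDP handshake and see whether the server replies. The sketch below is based on my understanding of the X.224 Connection Request described in Microsoft's RDP protocol documentation; the byte string and the function name are my own construction, so treat it as an assumption to verify rather than a reference implementation.

```python
import socket

# First packet of an RDP connection: TPKT header + X.224 Connection Request
# carrying an RDP negotiation request (19 bytes in total).
RDP_CONNECTION_REQUEST = bytes.fromhex(
    "03000013"          # TPKT: version 3, reserved, total length 19
    "0ee00000000000"    # X.224: length 14, CR (0xE0), dst-ref, src-ref, class 0
    "0100080003000000"  # RDP_NEG_REQ: type 1, flags 0, length 8, TLS | CredSSP
)


def rdp_answers(host, port=3389, timeout_s=5.0):
    """True if the peer replies with an X.224 Connection Confirm."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            sock.sendall(RDP_CONNECTION_REQUEST)
            reply = sock.recv(64)
    except OSError:
        return False
    # Expect a TPKT header (0x03) whose X.224 code byte is Connection Confirm (0xD0).
    return len(reply) >= 6 and reply[0] == 0x03 and reply[5] == 0xD0
```

A True result means something that speaks RDP is alive behind the port forward, which is stronger evidence than an open port alone. A False result can still mean the service is up but configured to reject this kind of request, so follow up with a real Remote Desktop client.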
Is SNMP enabled? Many businesses use SNMP for monitoring and reporting on system health status. You can use free SNMP tools such as Paessler SNMP Tester or a generally not-free SNMP monitoring suite such as SolarWinds or WhatsUp Gold to check on the status of the remote server. For Windows servers, SNMP gets its information through WMI, so if SNMP is enabled and returns results, then the server is definitely up and responsive.
Do you have remote out-of-band management? This is usually in the form of an Integrated Dell Remote Access Controller (Dell iDRAC) or HP Integrated Lights-Out (HP iLO) controller. These devices provide remote access to the status of a server and, with the more costly enterprise versions, even allow remote control of a server. They can generally only be accessed from the local network and require unique root administrative credentials (though they can be integrated with Active Directory for identity management). Note that if you have an iDRAC or iLO controller, it will usually have its own IP address assigned to it. If you can ping the iDRAC or iLO but not the server, then you know the network is up and the server is down. (The exception is some larger enterprises that, for security reasons, put their out-of-band management on a separate management network with limited access.)
In the case of a virtual server, do you have access to the VM hypervisor? This would usually be in the form of VMware vSphere or vCenter or Microsoft Hyper-V Manager. (I hear some people out there still use Citrix XenServer for hypervisors...) If you have access to the hypervisor, then you have exactly the same as physical access to the server and you don't have to get on a plane. In the case of the latest versions of VMware ESXi, you can even manage the hypervisor via a browser. And if it's not a virtual server, ask yourself why not?
A few things you'll want to consider. Just because your application no longer connects to the destination server on those ports does not mean the destination server is down or even that the destination network firewall is blocking it. Your ISP may be blocking those ports, or it may be your local network firewall or even your OS firewall blocking the application's access. Don't be afraid to try all of the above tests from different local networks (your office and maybe your home network too).
The server may be up and running fine, but the application services that it hosts may have crashed or hung. In this case, the server itself will respond to ping and other tests, but the ports your application uses will be closed.
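To keep the different outcomes straight, the reasoning so far can be written down as a small decision function. This is just my summary of the cases discussed above, not a diagnostic tool; the function name, the inputs, and the wording of the messages are made up for illustration, and every conclusion carries the usual "unless something in between is blocking it" caveat.

```python
def interpret(ping_ok, port_ok, oob_ok=None):
    """Map test results to the most likely explanation.

    ping_ok: the server answered ping
    port_ok: the application's port accepted a connection
    oob_ok:  the iDRAC/iLO answered ping (None if there is none or it was not tested)
    """
    if port_ok:
        return "server and application service are answering"
    if ping_ok:
        # OS is alive but nothing accepts connections on the application port.
        return "server is up; application service has likely crashed or hung, or the port is blocked"
    if oob_ok:
        # Management controller is reachable, so the remote network is fine.
        return "network is up; server is down or is blocking ICMP"
    return "server or network is down, or ICMP is blocked; retest from another network"
```

For example, interpret(ping_ok=True, port_ok=False) describes the crashed-service case from the paragraph above.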
You mentioned that there is no remote access to *nix workstations at the remote network, but do you have remote access to Windows workstations? Even a Mac will give you a local workstation from which you can test ping, trace route, services, and even Remote Desktop.
Is the switch it's connected to a managed switch? If so, and if you can log in to the CLI of that switch, you can check the status of the interface the server is connected on. If the interface shows down or disconnected, then the server is offline, the network cable has failed or is unplugged, or the switch port has failed - all of which will require a flight plan if there is no qualified technician on site.
Do you have remote access to the firewall? Nearly all firewalls have basic diagnostic functions such as ping and trace route. You can use these to test the remote server on its local network without having access to a workstation on the remote network. Some will even allow telnet or SSH, allowing you to connect from the firewall to the routers and switches on the remote network. If your firewall does not have these functions in the GUI and does not have a CLI that you can access, get your buddy on the plane - not to check the server, but to replace the POS firewall you're using out there. While he's out there, he can fix the server problem, P2V it, install ESXi or Hyper-V, and forward remote hypervisor management and out-of-band iDRAC or iLO so you don't have to go through this again.
Rereading this, I realize there are a few points I should add.
If you have access to the firewall to do port forwarding, you can temporarily forward ICMP traffic to the remote server, allowing you to run ping tests directly to it. You can also forward any other port to test various Windows services over the WAN.
Also, nearly EVERY Dell or HP server comes with a basic out of band management interface that will at least let you power off and reboot the remote server. Whiteboxes are less consistent with OOB management.
If the remote server has Windows Remote Management (WinRM) enabled, you can potentially manage the Services MMC remotely, though there are a lot of security hoops to jump through for WinRM to work. If you're CLI savvy, you can possibly use PsTools to perform the same functions - i.e. restarting services - but there are still security hoops to jump through for this to work. Both of these options would require setting up the appropriate port forwards on the firewall.
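Before fighting with WinRM authentication, you can check whether the WinRM listener is there at all. WinRM listens on TCP 5985 for HTTP by default (5986 for HTTPS), so a plain HTTP request to the /wsman path should get some kind of HTTP response if the service is running. The sketch below assumes the default HTTP listener and that you have forwarded that port; the function name is mine. As far as I know an unauthenticated request normally gets a 401 back, but any status code at all suggests Windows is up and the service is listening.

```python
import http.client


def winrm_http_status(host, port=5985, timeout_s=5.0):
    """Return the HTTP status from the WinRM endpoint, or None if nothing answers."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout_s)
    try:
        # Empty, unauthenticated POST; we only care that something replies.
        conn.request("POST", "/wsman", body=b"")
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return None
    finally:
        conn.close()
```

If this returns None while other ports on the same server answer, WinRM is probably not enabled or is blocked by the Windows firewall, and PsTools over SMB would be the next thing to try.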
The problem in this case is that just knowing the server is powered on doesn't help you. In all likelihood even if Windows is up and running fine, the lack of access to application services means your application has crashed and the lack of remote access to the server means you have no way to fix your application.
Assuming the application services run at startup, the simplest solution in the case of an inaccessible remote headless server is to have someone on site power cycle it. But this specific case seems like a perfect storm of poor design decisions (ostensibly made by someone else) leading to someone getting on a plane.