We just completed a fairly complex network design for our new office. It has 2 ADSL routers connected to a dual-WAN load balancer router. The load balancer is connected to two 16-port switches, which connect 30 PCs. Also, one 16-port switch is connected to another 16-port switch, which in turn connects to the load balancer.
So my PC has this logical path: PC >> SWITCH A >> SWITCH B [Optional] >> Load Balancer >> ADSL Modem [one of two available in network]
As I was facing some weird problems, I decided to run some diagnostics. My internet connection mostly works fine, but HTTP POSTs and file uploads sometimes time out.
Traceroute to an external server (I get the same output for Google, Facebook, etc.). The number of hops is always 15.
[rtcamp@main ~]$ traceroute rtcamp.com
traceroute to rtcamp.com (70.32.85.76), 30 hops max, 60 byte packets
1 * * *
2 * * *
3 * * *
4 * * *
5 * * *
6 * * *
7 * * *
8 * * *
9 * * *
10 * * *
11 * * *
12 * * *
13 * * *
14 * * *
15 rtcamp.com (70.32.85.76) 362.911 ms 364.550 ms 366.284 ms
Traceroute to the load balancer router:
[rtcamp@main /]$ traceroute 192.168.0.1
traceroute to 192.168.0.1 (192.168.0.1), 30 hops max, 60 byte packets
1 * * *
2 * * *
3 * * *
4 * * *
5 * * *
6 * * *
7 * * *
8 * * *
9 * * *
10 * * *
11 * * *
12 * * *
13 * * *
14 * * *
15 * * *
16 * * *
17 * * *
18 * * *
19 * * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *
My biggest problem is this: we have set up a public server for a subdomain, say sub.example.com. sub.example.com works from the outside world, but cannot be reached from within our network.
I think if I can get normal traceroute output, things will be solved.
Any solution or idea?
Thanks,
-Rahul
Added on 10 September
Details of our network setup
- We have network of 192.168.0.x
- 192.168.0.1 is load balancer
- 192.168.1.1 is ADSL modem A
- Another ADSL modem is in bridge mode
- We have PCs from 192.168.0.2 to 192.168.0.50 (PCs get their IP addresses dynamically)
- 192.168.0.101 and 192.168.0.102 are for the server in the LAN. It is a single server with 2 LAN cards, hence 2 IP addresses.
- 192.168.0.200 is the Wi-Fi router, and IP addresses from 192.168.0.201 onwards are for laptops connected to the Wi-Fi router. The Wi-Fi router also gets LAN IP 192.168.0.100 from the load balancer on its Ethernet interface.
One thing: if you're using a Solaris-based traceroute, you can run traceroute -I rtcamp.com, which uses ICMP for the traceroute. We do this at work since UDP traceroute is blocked on our firewall.
The other thing: you may have an ACL on your WAN router, or a firewall not mentioned here, that is blocking the ICMP time-exceeded messages. If you allow these messages, at least internally, traceroute should work (and there is no risk in allowing this; only some types of ICMP messages are bad).
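To make the pattern in the question's output easier to see, here is a minimal Python sketch that summarizes traceroute text. The function name `summarize_traceroute` and the parsing rules are my own assumptions, written against the Linux output format shown in the question; it is not a general-purpose parser, and the sample is an abbreviated copy of the first trace.

```python
import re

def summarize_traceroute(output):
    """Return (silent_hops, answering_hops) as lists of hop numbers."""
    silent, answering = [], []
    for line in output.splitlines():
        match = re.match(r"\s*(\d+)\s+(\S.*)$", line)
        if not match:
            continue  # header line, blank line, or a hop with no fields
        hop = int(match.group(1))
        fields = match.group(2).split()
        # A hop is silent when every probe for it is printed as '*'.
        if all(field == "*" for field in fields):
            silent.append(hop)
        else:
            answering.append(hop)
    return silent, answering

# Abbreviated from the first trace in the question (hops 3-14 omitted).
sample = """traceroute to rtcamp.com (70.32.85.76), 30 hops max, 60 byte packets
 1 * * *
 2 * * *
15 rtcamp.com (70.32.85.76) 362.911 ms 364.550 ms 366.284 ms"""

silent, answering = summarize_traceroute(sample)
print("silent hops:", silent)
print("answering hops:", answering)
```

Silent intermediate hops followed by an answering destination would be consistent with time-exceeded replies being dropped somewhere while the destination's own reply still gets through.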
As for the clients not being able to talk to the servers: are they on the same subnet, or are they on a separate, possibly secured network that is non-routable? It sounds like a rather simple network, but it also sounds like your WAN router might be specialized and might not do actual routing internally.
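The same-subnet question can be checked quickly against the addresses listed in the question's update. A minimal Python sketch, assuming a /24 netmask (the question only says 192.168.0.x, so the prefix length is my assumption) and a hypothetical helper name `same_subnet`:

```python
import ipaddress

def same_subnet(addr_a, addr_b, network="192.168.0.0/24"):
    """True if both addresses fall inside the given network."""
    net = ipaddress.ip_network(network)  # /24 is an assumption
    return (ipaddress.ip_address(addr_a) in net
            and ipaddress.ip_address(addr_b) in net)

# A DHCP client and the LAN server from the question's address plan.
print(same_subnet("192.168.0.2", "192.168.0.101"))
# ADSL modem A sits at 192.168.1.1, outside the assumed /24.
print(same_subnet("192.168.0.2", "192.168.1.1"))
```

If the clients and the server really share one subnet, traffic between them should stay on the switches and never need the load balancer to route it.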
It looks to me like you have a switching loop. Do the switches run STP?
It looks and sounds like you may have a routing loop, given that inbound access works but outbound does not. I could be completely wrong, but that's what it seems like.
Try setting up an SSH connection to an external system. If it fails, or is very slow, you might have a routing problem.
I would also think about removing one of the routers from the load-balancer. If you can traceroute with only one router, you have a routing or load balancing issue.
Something is blocking ICMP. Blocking ICMP "time exceeded in transit" messages breaks traceroute. Blocking ICMP "fragmentation needed" messages breaks TCP path MTU discovery, which would explain why large transfers such as POSTs and uploads time out while small requests work. Most likely it's the load-balancing router.
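One way to test for a path MTU black hole is to ping with the Don't Fragment bit set and shrink the payload until replies come back. Below is a minimal Python sketch that only builds the command line. It assumes IPv4 and the Linux iputils `ping`, whose `-M do` option prohibits fragmentation. The helper names are hypothetical, and 1492 is the usual PPPoE MTU, which may or may not apply to these ADSL links.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo request header

def payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one IPv4 packet of size mtu."""
    return mtu - IPV4_HEADER - ICMP_HEADER

def df_ping_command(host, mtu, count=3):
    """Build a Linux iputils ping command with fragmentation prohibited."""
    return ["ping", "-c", str(count), "-M", "do",
            "-s", str(payload_for_mtu(mtu)), host]

# Candidate MTUs: plain Ethernet, then PPPoE (assumed, not confirmed).
for mtu in (1500, 1492):
    print(" ".join(df_ping_command("rtcamp.com", mtu)))
```

If the larger payload gets no reply at all while a smaller one does, and no "fragmentation needed" error is reported back, that error is probably being dropped along the path. The test is inconclusive if echo requests themselves are filtered.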