I have a server, mike, at our office that communicates with a remote server through a VPN. There have been various problems that seem to be linked to the MTU size. Mike is a RHEL 3 server and the customer server is CentOS 5. I used the tracepath tool to try to find the maximum MTU and got this weird result:
[root@mike root]# tracepath 192.168.1.4
1: mike (192.168.100.1) 0.170ms pmtu 552
1: mike (192.168.100.1) 0.011ms pmtu 552
1: mike (192.168.100.1) 0.010ms pmtu 552
snip - thousands of lines of the same output
1: mike (192.168.100.1) 0.025ms pmtu 552
1: 192.168.100.252 (192.168.100.252) 0.405ms
2: 192.168.100.253 (192.168.100.253) 0.876ms
3: 192.168.1.4 (192.168.1.4) 97.1000ms reached
Resume: pmtu 552 hops 3 back 3
From another server at our office to a different customer, I get a far more reasonable-looking result:
[root@nora ~]# tracepath 192.168.2.1
1: nora (192.168.100.228) 0.080ms pmtu 1500
1: 192.168.100.253 (192.168.13.253) asymm 2 0.813ms
2: no reply
3: 192.168.11.1 (192.168.11.1) 73.210ms reached
Resume: pmtu 1500 hops 3 back 3
So I think it is something to do with mike rather than the routers, firewalls or VPN in between. Any ideas?
In response to Daniel Lawson's answer below, here is why I believe the problem is with server mike: nora gets the expected response for the same destination.
[root@nora ~]# tracepath 192.168.1.4
1: nora (192.168.100.228) 0.101ms pmtu 1500
1: 192.168.100.253 (192.168.100.253) asymm 2 0.863ms
2: no reply
3: 192.168.1.4 (192.168.1.4) 111.601ms reached
Resume: pmtu 1500 hops 3 back 3
In response to Mike Pennington's comment: both 192.168.100.252 and 192.168.100.253 are firewalls. The default gateway is 192.168.100.252, which has a static route that sends traffic for these customers to 192.168.100.253. As they are on the same network, I assume that is why the hop count isn't incremented.
From what you've said, the path from the server "mike" to one customer is limited to 552, and a different path, from "nora" to a different customer, is not limited.
You're not comparing the same path, so unless there's more information, I doubt this is specific to the server "mike". The PMTU is the smallest MTU along the path, so if this were specific to mike, it should happen for any machine you tracepath to.
Given that you're using VPN connections, it's not really surprising that you're seeing a constrained PMTU, though. I'd check the MTU settings in the VPN configuration for the first customer, and I'd also try checking the MTU directly to the customer's VPN endpoint (e.g. the public IP address that terminates the VPN). One or the other of those is likely to have your answer.
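One way to do that check by hand is to send don't-fragment probes of different sizes and search for the largest one that gets through. Below is a minimal sketch of that search in Python. The names `payload_for_mtu`, `find_path_mtu` and `ping_probe` are made up for this illustration, and `ping_probe` assumes a Linux iputils `ping` that accepts `-M do`, `-s`, `-c` and `-W`; check the ping on your own system before relying on it.

```python
import subprocess
from typing import Callable

IP_HEADER = 20    # IPv4 header without options
ICMP_HEADER = 8   # ICMP echo request header


def payload_for_mtu(mtu: int) -> int:
    """ICMP payload size that produces an IPv4 packet of exactly `mtu` bytes."""
    return mtu - IP_HEADER - ICMP_HEADER


def find_path_mtu(probe: Callable[[int], bool], low: int = 68, high: int = 1500) -> int:
    """Binary-search the largest MTU in [low, high] for which probe(mtu) is True.

    Assumes that if a packet of some size passes, every smaller packet passes too.
    """
    if not probe(low):
        raise ValueError("even the smallest probe failed; is the host reachable?")
    while low < high:
        mid = (low + high + 1) // 2   # round up so the loop always makes progress
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


def ping_probe(host: str) -> Callable[[int], bool]:
    """Hypothetical probe: one don't-fragment ping per size (assumes iputils ping)."""
    def probe(mtu: int) -> bool:
        cmd = ["ping", "-M", "do", "-c", "1", "-W", "2",
               "-s", str(payload_for_mtu(mtu)), host]
        return subprocess.run(cmd, capture_output=True).returncode == 0
    return probe
```

Something like `find_path_mtu(ping_probe("192.168.1.4"))` run from mike, and then again against the VPN's public endpoint, should show whether the limit sits inside the tunnel or before it. A lost reply counts as a failed probe here, so repeat the run if the link is lossy.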
It's quite common for firewalls and VPNs from various vendors (e.g. Check Point) to reduce the MTU below 1500 for some types of traffic.
If you have a Windows machine handy you can run mturoute in traceroute mode to determine which link/hop has the low MTU.
http://www.elifulkerson.com/projects/mturoute.php
I think the issue is related to the Juniper device not returning an MTU value in the ICMP Unreachable packet that is part of PMTUD. I've seen similar issues with Juniper before.
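To illustrate why a missing MTU value matters: RFC 1191 suggests that when the "fragmentation needed" message carries no usable next-hop MTU, the sender should guess by stepping down through a table of common "plateau" values. The sketch below follows that suggestion. The function name `next_pmtu` is invented for this example, and real kernels use their own tables and limits. As far as I know, Linux also refuses to go below its `net.ipv4.route.min_pmtu` sysctl, which defaults to 552, so that may be where the 552 in the question comes from.

```python
# Plateau values from RFC 1191, section 7.1 (largest first).
PLATEAUS = (65535, 32000, 17914, 8166, 4352, 2002, 1492, 1006, 508, 296, 68)

MIN_IPV4_MTU = 68  # smallest MTU an IPv4 link is allowed to have


def next_pmtu(current_mtu: int, reported_mtu: int) -> int:
    """Pick a new path MTU estimate after a 'fragmentation needed' ICMP error.

    reported_mtu is the next-hop MTU field from the ICMP message;
    0 means the router did not fill it in.
    """
    # A sane reported value is used directly.
    if MIN_IPV4_MTU <= reported_mtu < current_mtu:
        return reported_mtu
    # Otherwise fall back to the next plateau below the current estimate.
    for plateau in PLATEAUS:
        if plateau < current_mtu:
            return plateau
    return MIN_IPV4_MTU
```

With a helpful router, `next_pmtu(1500, 1400)` settles on 1400 in one step. With a router that reports 0, the sender can only guess downwards (1492, then 1006, and so on), and each guess costs another dropped packet.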