The network configuration in my Xen setup is the following:
- the dom0 has 3 network cards (eth0, eth1, eth2) and 3 bridges (xenbrE, xenbrI, xenbrD); each bridge contains the corresponding network card. Only xenbrD has an IP address configured (192.168.78.2, on a private LAN) so that it can communicate with all the domU.
- there's a domU acting as firewall/router; it also has 3 virtual network cards (eth0, eth1, eth2). It does masquerading for traffic going out on eth0 (the external interface, which is part of xenbrE).
My problem is that when I download a big file from the internet over HTTP in the dom0, the download rate is not stable. It climbs progressively, then stalls for a few seconds, then climbs again, and so on in a loop until the download completes. During the stalls, it looks like all networking is blocked on the machine (noticeable on interactive SSH sessions).
dom0                             │ domU
wget                             │
 ↕                               │
eth2↔xenbrD(192.168.78.2)↔vif2.2←┼→eth2(192.168.78.1/24)
                                 │    ↕ masquerading
eth0↔xenbrE↔vif2.0←——————————————┼→eth0(192.168.1.20/24)
 ↕                               │
internet                         │
If I do the same download through a (non-caching) HTTP proxy that runs in the firewall domU, the download rate is stable at its maximum value.
How can I avoid this problem?
I suspect it's a bug in the networking stack but I would like assistance to diagnose it more precisely (and maybe find a work-around).
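One frequently reported cause of exactly this stop-and-go throughput on Xen bridged networking is TCP checksum/segmentation offload on the netfront/netback path. As a diagnostic experiment (not a confirmed fix; interface and vif names below are the ones from my setup, adjust to yours), the offloads can be disabled with ethtool — cheap to try and easy to revert:

```shell
# In dom0: disable TX offload on the physical NIC and on the
# backend vif that faces the firewall domU via xenbrD.
ethtool -K eth2 tx off
ethtool -K vif2.2 tx off tso off

# In the domU: disable it on the frontend interface facing dom0.
ethtool -K eth2 tx off tso off
```

If the download rate becomes stable afterwards, the offload path is the culprit; the settings can be made persistent from /etc/network/interfaces with post-up lines.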
This is a Debian Etch system with Xen 3.2 and the 2.6.26-xen-686 kernel of Debian Lenny (backports). The bridges are created with /etc/network/interfaces:
auto lo
iface lo inet loopback

auto xenbrE
iface xenbrE inet manual
    bridge_ports eth0
    bridge_maxwait 0

auto xenbrI
iface xenbrI inet manual
    bridge_ports eth1
    bridge_maxwait 0

auto xenbrD
iface xenbrD inet static
    address 192.168.78.2
    netmask 255.255.255.0
    gateway 192.168.78.1
    bridge_ports eth2
    bridge_maxwait 0
The xend configuration is not complicated:
# grep '^(' /etc/xen/xend-config.sxp
(network-script network-dummy)
(vif-script vif-bridge)
(dom0-min-mem 150)
(dom0-cpus 0)
(vncpasswd '')
The Xen network setup of the domU is done with:
# grep vif /etc/xen/xm.slis
vif = [ 'mac=00:16:3e:14:85:11, bridge=xenbrE', 'mac=00:16:3e:14:85:12, bridge=xenbrI', 'mac=00:16:3e:14:85:13, bridge=xenbrD' ]
And the only routing in dom0 goes to the domU via xenbrD:
# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.78.0 0.0.0.0 255.255.255.0 U 0 0 0 xenbrD
0.0.0.0 192.168.78.1 0.0.0.0 UG 0 0 0 xenbrD
In the domU, the only iptables configuration done is:
iptables -t nat -A POSTROUTING -s 192.168.78.0/24 -o eth0 -j MASQUERADE
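If the stalls turn out to be MSS/MTU related (large segments from the internet being dropped somewhere on the bridged path), a standard companion to a MASQUERADE rule is MSS clamping on forwarded connections. This is a generic netfilter technique, not something confirmed to fix this particular setup:

```shell
# In the firewall domU: rewrite the MSS option of forwarded TCP SYNs
# so that endpoints never negotiate segments larger than the path MTU.
iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN \
         -j TCPMSS --clamp-mss-to-pmtu
```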
This really sounds like a memory issue to me; it would also explain why a local proxy helps, because it stalls everything a little so the kernel can catch up handling the packets. Maybe check this by giving dom0 more memory. I have a similar setup here at work, and since we use it for speed measurements I'm greatly interested in anything you find out about this (even though I don't experience the problem here).
Maybe it's Xen-related, but could you check with another client besides dom0? Does another domU work fine? Could this be a problem in your NAT setup, like an MSS/MTU problem?
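The MSS/MTU hypothesis can be tested from dom0 with ping and the don't-fragment flag (Linux iputils ping). On a standard 1500-byte Ethernet MTU, the largest ICMP payload that fits is 1500 − 28 = 1472 (20 bytes IP header plus 8 bytes ICMP header); the target address below is the domU router from the question:

```shell
# Probe path MTU from dom0 through the domU router.
# -M do   : set the DF bit, refuse local fragmentation
# -s 1472 : payload size; 1472 + 28 header bytes = 1500 on the wire
ping -c 3 -M do -s 1472 192.168.78.1
```

If this fails while a smaller payload (say -s 1400) works, the path MTU is below 1500 and MSS clamping or an MTU fix is worth pursuing.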
This will happen if you are low on memory. Check your memory usage and also your CPU usage. If you have a lot of io_wait, then get more memory and allocate more of it to the dom0.
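To check the memory and io_wait suggestions above, the standard tools are enough; xm is the Xen 3.x toolstack already used in the question (the 512 MiB figure below is just an example value, not a recommendation):

```shell
# Memory currently allocated to each domain.
xm list

# Watch CPU and memory while reproducing the download:
# the 'wa' column is io_wait, 'si'/'so' show swapping in dom0.
vmstat 1

# dom0 memory can be raised at runtime, e.g. to 512 MiB:
xm mem-set Domain-0 512
```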