
Unexpected high dom0 load for bridges, especially with VLAN tag

I've been working on cutting down the number of "little" boxes here and
rebuilt a perimeter firewall and interior router/firewall on Xen 4.1.1
running Debian Buster dom0 on an Intel i3-7100T
(c. 2017, 2 cores, 4 threads, 3.4 GHz)
with a dual-port, Intel PCI NIC (believed genuine)
in addition to the onboard NICs.


I'm seeing ~140-150% dom0 load in xentop when passing ~250 Mbit/s of
packets between the two domUs on a dedicated, two-port Open vSwitch
bridge.

This seems excessive for what should be "just a wire" between the two
(other traffic for them is on PCI pass-through of the Intel NICs).

Taking the function of these domUs out of the picture and bringing up
two "fresh" Debian Buster domUs, iperf3 still shows seemingly
high load, especially if VLAN tags are involved. This is seen with
both Open vSwitch and Linux bridges:
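For reference, the untagged and tagged cases can be reproduced with
something like the following (the bridge name, vif names, addresses,
and VLAN ID are all placeholders; substitute whatever `ip link` shows
on your dom0):

```shell
# dom0: a plain two-port Linux bridge carrying the two domU vifs
# (vif1.0 / vif2.0 are examples; check `ip link` for the real names)
ip link add br-test type bridge
ip link set vif1.0 master br-test
ip link set vif2.0 master br-test
ip link set br-test up

# domU A, untagged case: address the interface and serve iperf3
ip addr add 10.0.0.1/24 dev eth0
iperf3 -s

# domU B, tagged case: add an 802.1Q subinterface before testing
ip link add link eth0 name eth0.100 type vlan id 100  # VLAN 100 is an example tag
ip addr add 10.0.100.2/24 dev eth0.100
iperf3 -c 10.0.100.1 -b 300M  # cap at 300 Mbit/s while watching xentop
```
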

Without VLAN tag

    at 300 Mbits/s    ~18% dom0 load
    at 1000 Mbits/s   ~40% dom0 load

With VLAN tagging/detagging from the domU interfaces

    at 300 Mbits/s    ~ 40% dom0 load
    at 1000 Mbits/s   ~115% dom0 load

As there are only two ports on the bridge and two MAC addresses
involved, this seems high. No bridge filtering is configured.

It is especially surprising that using a single, consistent VLAN tag
"on the wire" doubles or triples the load.

This is reasonably consistent whether the bridge is Open vSwitch or a
Linux bridge set up with a Debian /etc/network/interfaces config,
created with `brctl`, or created with `ip link add ... type bridge`.
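Concretely, the three Linux-bridge variants I tried were along these
lines (bridge and vif names are illustrative):

```shell
# 1) Debian ifupdown stanza in /etc/network/interfaces (bridge-utils):
#      auto br0
#      iface br0 inet manual
#          bridge_ports vif1.0 vif2.0
#          bridge_stp off

# 2) Legacy brctl:
brctl addbr br0
brctl addif br0 vif1.0
brctl addif br0 vif2.0
ip link set br0 up

# 3) iproute2 only:
ip link add br0 type bridge
ip link set vif1.0 master br0
ip link set vif2.0 master br0
ip link set br0 up
```
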

Is this kind of load expected?

Is there any configuration of either style bridge that might
significantly improve this?

(At least for now, I need to stick with tagging the VLAN as I'm trying
to unravel why running without the tag causes some throughput problems
with the domUs involved.)

More detail:

xen 4.1.1
ovs 2.10.1
Linux xen-i3 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2 (2020-04-29) x86_64 GNU/Linux

The interior router uses one of the Intel card's NICs via PCI
pass-through to connect to the Cisco SG300 switch for "inside" access
to VLAN trunks. It is running FreeBSD 12.1 in HVM mode.

The perimeter firewall uses the other of the Intel card's NICs to
connect through the Cisco to the DOCSIS modem. The Comcast line is
good for ~250 Mbps down and ~10 Mbps up. It is running Debian Buster
in PV mode, booted through grub-x86_64-xen.custom.bin (to recognize
the ZFS file system on which it runs).

The two are connected through a dedicated, two-port Open vSwitch
bridge, with the same VLAN tag they were running with when the two
functions each had their own, physical hardware.

When running a bandwidth test from a local host to a remote server,
the outbound packet path, as I understand it, is:

    Interior host sources
    Interior host sends via Cisco SG300
    Cisco SG300 forwards to Intel NIC "0" on PCI pass-through to wildside (interior)
    Wildside processes, routes over VIF pair, tagged
    Received on other end of pair by dom0
    Packet bridged by dom0
    Packet goes out another VIF pair to front (perimeter)
    Front receives packet at other end of VIF pair
    Front routes packet out Intel NIC "1" on PCI pass-through
    Cisco SG300 forwards packet to the modem

Examining htop on dom0 under load shows truncated names that appear to
be queues, three or four associated with each of the two involved VIFs.
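If I'm reading the source right, those truncated names are the
xen-netback per-queue kernel threads (named along the lines of
vif<domid>.<handle>-q<N>-guest / -dealloc, which htop clips at 15
characters). Something like this should list them untruncated and show
the queue-count knob (paths assume the xen_netback module):

```shell
# List the per-queue netback kthreads and their CPU usage;
# expect a -guest and a -dealloc thread per queue per vif
ps -eo pid,pcpu,comm | grep 'vif.*-q'

# Queues generally default to min(dom0 vCPUs, max_queues); the module
# parameter can be read here, or capped at boot with xen_netback.max_queues=N
cat /sys/module/xen_netback/parameters/max_queues
```
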

No special configuration of kernel governor, CPU affinity, or the like
has been done on dom0 or any of the domUs.

I've run them both tagged, and was working to cut over to untagged on
both, but have run into a dribble of throughput when I do. As that
involves a non-Linux domU, I'll work through that in another thread.

The current xl config has front untagged and wildside still tagged.

Front (perimeter router)

vif = [
    'script=vif-openvswitch,type=vif,vifname=front-zfs_xn0,bridge=ovsbr0:<mgmt VLAN>:<other VLAN>',
    'script=vif-openvswitch,type=vif,vifname=front-zfs_xn1,bridge=ovsbr1.<link VLAN>',

pci = [

Wildside (interior router)

vif         = [
'script=vif-openvswitch,type=vif,vifname=wildside_xn0,bridge=ovsbr1:<link VLAN>',

pci = [

The VIF names seem to be within the typical 15-character limit.

This was previously running on an AMD GX-412TC (4 core, 1 GHz) and a
Celeron J1900 (4 core, 2 GHz).

The i3-7100T and the NICs on its Intel card have been used to
benchmark networking at up to GigE symmetric rates.

I've tried to "direct wire" the two domUs by specifying a backend
for the VIF in the xl config. Though I am surprised that the VIF pair
and a two-port bridge are apparently so CPU hungry, even at low
speeds, such a connection would seem to simplify things by removing
one VIF pair and the bridge entirely.
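For the record, the sort of vif spec I was attempting used xl's
documented `backend=` key to name the other domU as the backend
domain; the names here are placeholders and I haven't gotten this
working:

```
vif = [
    'backend=wildside,type=vif,vifname=front_direct',
]
```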

Even if that were possible, it still leaves me with concerns around
using VLAN trunking and its apparent impact on CPU load. This all came
about as suricata was the next service I was going to try to move to
the Xen box.


Jeff Kletsky


