
Re: [Xen-users] all packets between certain guests on the same host being dropped





On 01/11/2017 05:26 PM, WebDawg wrote:


On Wed, Jan 11, 2017 at 11:02 AM, Sherrard Burton <sburton@xxxxxxxxxxxxx> wrote:



    On 01/10/2017 03:19 PM, Sherrard Burton wrote:

        TL;DR
        all packets are being dropped in a debian 7 (wheezy) guest only when
        they are coming from a debian 5 (lenny) guest on the same host. the
        console and kernel log report 'net eth0: Invalid extra type: 4' when
        packets are being dropped. the problem goes away if i change the wheezy
        guest's configuration from 1 vcpu to >1 vcpu. i tested all of this on
        fresh, minimal installs, so AFAICT there are no firewalls or other
        esoteric settings involved.


    i noticed that there is a backport kernel for lenny that includes
    the PV on HVM drivers. after upgrading to that kernel, the problem
    also goes away. i also confirmed the problem when the target is a
    debian 8 (jessie) guest.

    so it appears that the problem here is specific to HVM guests
    attempting to communicate with single-vcpu PV guests. if my analysis
    is correct, that would seem to imply that the root of the problem is
    in the packet-handling code in the host, no?



        FULL VERSION
        this is a strange one, so please forgive me if i omit some
        useful details.

        intro:
        i have a pair of xen hosts running several guest HA pairs, with one
        member of each pair on each host.
        for example:

        host1
         \_apache-guest1
         |
         \_haproxy-guest1
         |
         \_appserver-guest1

        host2
         \_apache-guest2
         |
         \_haproxy-guest2
         |
         \_appserver-guest2

        with various HA solutions implemented within the guests. this is not
        germane to the particular problem, but germane to how i discovered it.
        for the sake of balancing, i have configured the guests' HA preferences
        so that the active nodes tend to be on different hosts. so under normal
        circumstances, apache-guest1 and haproxy-guest2 would be the active
        nodes. no problem at all in that situation.

        but i discovered that i cannot communicate between apache-guest1 and
        haproxy-guest1, located on the same host. after much tcpdumping in the
        host and guests, i discovered that the problem is unidirectional and
        specific to a particular OS combination.

        a) inbound packets to a debian wheezy guest are dropped only when they
        originate from a debian lenny guest on the same host

        b) outbound packets from a wheezy guest to a lenny guest are passed
        correctly, even though the wheezy guest cannot see the return
        communication from the lenny guest

        c) there is no problem communicating between the wheezy guest and an
        identically-configured lenny guest on the other host

        d) there is no problem communicating between other combinations of
        guests on the same host, i.e. from jessie to wheezy, lenny to lenny,
        wheezy to wheezy, etc.


        even stranger, my attempts to narrow it down to the simplest possible
        test case led me to discover that for the exact same guest, changing
        the vcpu setting from 1 to >1 makes the problem go away.

        sburton@host:~$ virsh -c xen:/// dumpxml wheezy-guest > ~/cannot-ping.xml
        # test and reconfigure
        sburton@host:~$ virsh -c xen:/// dumpxml wheezy-guest > ~/can-ping.xml

        sburton@host:~$ diff ~/can-ping.xml ~/cannot-ping.xml
        6c6
        <   <vcpu placement='static'>2</vcpu>
        ---
        >   <vcpu placement='static'>1</vcpu>
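
a quick way to confirm which vcpu setting a guest definition carries before each test run is to pull the count out of the dumpxml output. this is only a minimal sketch, assuming the single-line <vcpu> element shown in the diff above; vcpu_count is a hypothetical helper name, not part of virsh.

```shell
# hypothetical helper: print the vcpu count from libvirt domain XML on stdin.
# assumes the <vcpu ...>N</vcpu> element sits on a single line, as in the diff.
vcpu_count() {
    sed -n 's:.*<vcpu[^>]*>\([0-9][0-9]*\)</vcpu>.*:\1:p'
}

# usage (on the host):
#   virsh -c xen:/// dumpxml wheezy-guest | vcpu_count
```

it prints nothing if no matching element is found, so an empty result means the assumption about the XML layout does not hold.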



        testing methodology:
        simple ping between the two guests.

        initially broken because the ARP 'is-at' traffic from the lenny guest is
        dropped going into the wheezy guest, and ARP 'who-has' traffic from the
        lenny guest is dropped going into the wheezy guest. therefore the guests
        cannot discover one another.

        after manually setting the ARP cache entries on both guests:

        pinging from lenny to wheezy, tcpdump shows ICMP echo requests in the
        lenny guest and on the VIFs for both guests in the host. but the ICMP
        requests are unseen in the wheezy guest.

        pinging from wheezy to lenny, tcpdump shows ICMP echo requests and
        replies in the lenny guest and on the VIFs for both guests in the host.
        ICMP requests are seen in the wheezy guest, since they originate there,
        but the replies from the lenny guest are unseen.

        the problem is not limited to ARP or ICMP, all other communication i
        have tried fails similarly.
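
for anyone wanting to repeat the test, the steps above could be sketched roughly as follows. this is not necessarily the exact set of commands used here: the helper names, the DRY_RUN switch, and the addresses and interface names in the usage comments are all made up for illustration. only 'ip neigh replace ... nud permanent' and the tcpdump options are standard.

```shell
# run a command, or just print it when DRY_RUN is set (for review without root).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# pin a static neighbour (ARP) entry for a peer; args: peer-ip peer-mac interface
pin_neighbour() {
    run ip neigh replace "$1" lladdr "$2" dev "$3" nud permanent
}

# watch ARP and ICMP on one interface, printing link-level headers (-e)
# and skipping name resolution (-n); arg: interface
watch_arp_icmp() {
    run tcpdump -n -e -i "$1" 'arp or icmp'
}

# usage, with made-up values:
#   in each guest:  pin_neighbour 192.0.2.20 00:16:3e:00:00:02 eth0
#   on the host:    watch_arp_icmp vif5.0
```

running the same capture in the guest and on each guest's VIF in the host is what shows where the packets disappear.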

        the smoking gun (i hope):
        when packets are being dropped in the wheezy guest, the console and
        various logs report
        [ 6977.669408] net eth0: Invalid extra type: 4

        and the only reference i have found via my searching is this thread:
        https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg00565.html

        which seems to be unresolved.
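
to check that the message really tracks the dropped packets, one option is to count its occurrences in the kernel log before and after sending a known number of pings from the lenny guest. a minimal sketch; count_invalid_extra is a hypothetical helper name, and the kernel ring buffer can wrap, so the counts are only meaningful over a short test.

```shell
# count kernel log lines on stdin that carry the 'Invalid extra type' message.
# grep -c prints 0 and exits non-zero when nothing matches, hence the '|| true'.
count_invalid_extra() {
    grep -c -F 'Invalid extra type' || true
}

# usage (in the wheezy guest), before and after a ping from the lenny guest:
#   dmesg | count_invalid_extra
```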

        i'm hoping that some part of this tickles someone's memory, or piques
        their interest, or at least that someone can point me to some more
        troubleshooting steps i haven't thought of.

        TIA


        setup details:
        HOST:
        sburton@host:~$ cat /etc/issue
        Debian GNU/Linux 8 \n \l

        sburton@host:~$ uname -a
        Linux host 4.7.0-0.bpo.1-amd64 #1 SMP Debian 4.7.8-1~bpo8+1 (2016-10-19) x86_64 GNU/Linux

        sburton@host:~$ dpkg -l | grep -F -e libvirt-daemon -e xen-hypervisor -e qemu-system
        ii  libvirt-daemon             1.2.9-9+deb8u3       amd64  programs for the libvirt library
        ii  libvirt-daemon-system      1.2.9-9+deb8u3       amd64  Libvirt daemon configuration files
        ii  qemu-system-common         1:2.7+dfsg-3~bpo8+2  amd64  QEMU full system emulation binaries (common files)
        ii  qemu-system-x86            1:2.7+dfsg-3~bpo8+2  amd64  QEMU full system emulation binaries (x86)
        ii  xen-hypervisor-4.4-amd64   4.4.1-9+deb8u8       amd64  Xen Hypervisor on AMD64

        sburton@host:~$ grep -F -A1 '<os>' ~/cannot-ping.xml
          <os>
            <type arch='x86_64' machine='xenfv'>hvm</type>

        sburton@host:~$ grep -F -C2 'xenbr0' ~/cannot-ping.xml
            <interface type='bridge'>
              <mac address='00:16:3e:fb:2e:1c'/>
              <source bridge='xenbr0'/>
              <model type='rtl8139'/>
            </interface>


        sburton@host:~$ ip addr show xenbr0
        8: xenbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
            link/ether bc:30:5b:f0:32:b4 brd ff:ff:ff:ff:ff:ff
            inet 192.168.240.52/20 brd 192.168.255.255 scope global xenbr0
               valid_lft forever preferred_lft forever
            inet6 fe80::be30:5bff:fef0:32b4/64 scope link
               valid_lft forever preferred_lft forever



        GUESTS:
        fullvirt installs, created from netinst ISO via virt-manager running on
        my workstation, manipulated through some combination of virt-manager and
        local virsh commands.

        root@wheezy-guest:~# uname -a
        Linux wheezy-guest 3.16.0-0.bpo.4-amd64 #1 SMP Debian 3.16.36-1+deb8u2~bpo70+1 (2016-10-19) x86_64 GNU/Linux

        root@wheezy-guest:~# cat /etc/issue
        Debian GNU/Linux 7 \n \l

        root@lenny-guest:~# uname -a
        Linux lenny-guest 2.6.26-2-amd64 #1 SMP Sun Mar 4 21:48:06 UTC 2012 x86_64 GNU/Linux

        root@lenny-guest:~# cat /etc/issue
        Debian GNU/Linux 5.0 \n \l




By target you mean debian jessie as a guest right?  Looking at your
logs, are you running etch as the dom0?

yes, the last test case was debian 8 (jessie) as a guest.

the host is running jessie, and the connections fail from HVM guests to single-vcpu PV guests on the same host.





_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
https://lists.xen.org/xen-users

