
Re: [Xen-users] all packets between certain guests on the same host being dropped





On 01/11/2017 10:15 AM, WebDawg wrote:


On Wed, Jan 11, 2017 at 9:13 AM, Sherrard Burton <sburton@xxxxxxxxxxxxx> wrote:



    On 01/11/2017 09:47 AM, WebDawg wrote:



        On Tue, Jan 10, 2017 at 2:19 PM, Sherrard Burton
        <sburton@xxxxxxxxxxxxx> wrote:

            TL;DR
            all packets are being dropped in a debian 7 (wheezy) guest only when
            they are coming from a debian 5 (lenny) guest on the same host. the
            console and kernel log report 'net eth0: Invalid extra type: 4'
            when packets are being dropped. the problem goes away if i change
            wheezy configuration from 1 vcpu to >1 vcpu. i tested all of this on
            fresh, minimal installs, so AFAICT there are no firewalls or other
            esoteric settings involved.


            FULL VERSION
            this is a strange one, so please forgive me if i omit some useful
            details.

            intro:
            i have a pair of xen hosts running HA pairs of guests, with one
            member of each pair on each host. for example:

            host1
             \_apache-guest1
             |
             \_haproxy-guest1
             |
             \_appserver-guest1

            host2
             \_apache-guest2
             |
             \_haproxy-guest2
             |
             \_appserver-guest2

            with various HA solutions implemented within the guests. this is not
            germane to the particular problem, but germane to how i discovered
            it. for the sake of balancing, i have configured the guests' HA
            preferences so that the active nodes tend to be on different hosts.
            so under normal circumstances, apache-guest1 and haproxy-guest2
            would be the active nodes. no problem at all in that situation.

            but i discovered that i cannot communicate between apache-guest1 and
            haproxy-guest1, located on the same host. after much tcpdumping in
            the host and guests, i discovered that the problem is unidirectional
            and specific to a particular OS combination.

            a) inbound packets to a debian wheezy guest are dropped only when
            they originate from a debian lenny guest on the same host

            b) outbound packets from a wheezy guest to a lenny guest are passed
            correctly, even though the wheezy cannot see the return
            communication from the lenny guest

            c) there is no problem communicating to or from the wheezy guest and
            an identically-configured lenny guest on the other host

            d) there is no problem communicating to or from other combinations
            of guests on the same host. ie, from jessie to wheezy, lenny to
            lenny and wheezy to wheezy, etc.


            even stranger, my attempts in trying to narrow it down to the
            simplest possible test case led me to discover that for the same
            exact guest, changing the vcpu setting from 1 to >1 makes the
            problem go away.

            sburton@host:~$ virsh -c xen:/// dumpxml wheezy-guest > ~/cannot-ping.xml
            # test and reconfigure
            sburton@host:~$ virsh -c xen:/// dumpxml wheezy-guest > ~/can-ping.xml

            sburton@host:~$ diff ~/can-ping.xml ~/cannot-ping.xml
            6c6
            <   <vcpu placement='static'>2</vcpu>
            ---
            >   <vcpu placement='static'>1</vcpu>


            testing methodology:
            simple ping between the two guests.

            initially broken because the ARP 'is-at' traffic from the lenny
            guest is dropped going into the wheezy guest, and ARP 'who-has'
            traffic from the lenny guest is dropped going into the wheezy guest.
            therefore the guests cannot discover one another.

            after manually setting the ARP cache entries on both guests:

            pinging from lenny to wheezy, tcpdump shows ICMP echo requests in
            the lenny guest and on the VIFs for both guests in the host. but the
            ICMP requests are unseen in the wheezy guest.

            pinging from wheezy to lenny, tcpdump shows ICMP echo requests and
            replies in the lenny guest and on the VIFs for both guests in the
            host. ICMP requests are seen in the wheezy guest, since they
            originate there, but the replies from the lenny guest are unseen.

            the problem is not limited to ARP or ICMP, all other communication i
            have tried fails similarly.

            the smoking gun (i hope):
            when packets are being dropped in the wheezy guest, the console and
            various logs report
            [ 6977.669408] net eth0: Invalid extra type: 4

            and the only reference i have found via my searching is this thread:

            https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg00565.html

            which seems to be unresolved.

            i'm hoping that some part of this tickles someone's memory, or
            piques their interest, or at least that someone can point me to some
            more troubleshooting steps i haven't thought of.

            TIA


            setup details:
            HOST:
            sburton@host:~$ cat /etc/issue
            Debian GNU/Linux 8 \n \l

            sburton@host:~$ uname -a
            Linux host 4.7.0-0.bpo.1-amd64 #1 SMP Debian 4.7.8-1~bpo8+1
            (2016-10-19) x86_64 GNU/Linux

            sburton@host:~$ dpkg -l | grep -F -e libvirt-daemon -e xen-hypervisor -e qemu-system
            ii  libvirt-daemon            1.2.9-9+deb8u3       amd64  programs for the libvirt library
            ii  libvirt-daemon-system     1.2.9-9+deb8u3       amd64  Libvirt daemon configuration files
            ii  qemu-system-common        1:2.7+dfsg-3~bpo8+2  amd64  QEMU full system emulation binaries (common files)
            ii  qemu-system-x86           1:2.7+dfsg-3~bpo8+2  amd64  QEMU full system emulation binaries (x86)
            ii  xen-hypervisor-4.4-amd64  4.4.1-9+deb8u8       amd64  Xen Hypervisor on AMD64

            sburton@host:~$ grep -F -A1 '<os>' ~/cannot-ping.xml
              <os>
                <type arch='x86_64' machine='xenfv'>hvm</type>

            sburton@host:~$ grep -F -C2 'xenbr0' ~/cannot-ping.xml
                <interface type='bridge'>
                  <mac address='00:16:3e:fb:2e:1c'/>
                  <source bridge='xenbr0'/>
                  <model type='rtl8139'/>
                </interface>


            sburton@host:~$ ip addr show xenbr0
            8: xenbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
                link/ether bc:30:5b:f0:32:b4 brd ff:ff:ff:ff:ff:ff
                inet 192.168.240.52/20 brd 192.168.255.255 scope global xenbr0
                   valid_lft forever preferred_lft forever
                inet6 fe80::be30:5bff:fef0:32b4/64 scope link
                   valid_lft forever preferred_lft forever



            GUESTS:
            fullvirt installs, created from netinst ISO via virt-manager running
            on my workstation, manipulated through some combination of
            virt-manager and local virsh commands.

            root@wheezy-guest:~# uname -a
            Linux wheezy-guest 3.16.0-0.bpo.4-amd64 #1 SMP Debian 3.16.36-1+deb8u2~bpo70+1 (2016-10-19) x86_64 GNU/Linux

            root@wheezy-guest:~# cat /etc/issue
            Debian GNU/Linux 7 \n \l

            root@lenny-guest:~# uname -a
            Linux lenny-guest 2.6.26-2-amd64 #1 SMP Sun Mar 4 21:48:06 UTC 2012 x86_64 GNU/Linux

            root@lenny-guest:~# cat /etc/issue
            Debian GNU/Linux 5.0 \n \l





        I do not know if this helps at all:
        https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg00612.html


    that is the tail end of the "unresolved" thread i mentioned.

    i'm using the stock debian packages, and i have not poked around in
    the netfront driver, so i'm not intimately familiar with the
    suggested code changes. but i'm sure that i could insert some
    debugging and recompile given a little guidance.

    thanks for the response.



I am not familiar either.  I know w/ BSD there are issues w/ checksums
and such.  This seems to be a bug with the PV driver right?

that is my guess, based on the context provided by the other thread. but i am not at all positive, which is why i thought i'd put it to the list.
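
in case it helps anyone trying to reproduce this, here is a minimal sketch for tallying the log message quoted above by interface and extra type. the function name count_invalid_extra is made up for illustration, and the pattern assumes the kernel prints the message exactly as in the line i pasted ('net eth0: Invalid extra type: 4'); other kernel versions may word it differently.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not an existing tool):
# reads kernel log text on stdin and prints one line per distinct
# "net <iface>: Invalid extra type: <n>" message, prefixed by how
# many times it occurred.
count_invalid_extra() {
    # -o keeps only the matching part, dropping the timestamp so that
    # identical messages collapse together in uniq -c.
    grep -o 'net [^ :]*: Invalid extra type: [0-9][0-9]*' |
        sort | uniq -c
}
```

run in the wheezy guest as `dmesg | count_invalid_extra` before and after a burst of pings from the lenny guest; if the count climbs with the number of packets sent, that would at least tie the message to the drops.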




_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
https://lists.xen.org/xen-users

