[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"



Wednesday, March 12, 2014, 12:35:22 PM, you wrote:

> On Wed, Mar 12, 2014 at 02:42:43AM +0100, Sander Eikelenboom wrote:
> [...]
>> Ok you asked for it .. so here we go .. ;-) :
>> 
>> - Xen-unstable
>> - DomU:
>>      - PV guest
>>      - Debian Wheezy
>>      - Kernel: 3.14-rc5 (.config see dom0 .config)
>>      - Running netserver
>> - Dom0:
>>      - Debian Wheezy
>>      - Kernel: 3.13.6 + additional patches (see attached git log and .config)
>>      - 2 physical NIC's
>>      - Autoballooning is prevented by using dom0_mem=1536M,max:1536M for xen 
>> and mem=1536M for dom0 kernel stanzas in grub
>>      - vcpu 0 is pinned on pcpu 0 and exclusively for dom0
>> 
>> - Networking:
>>      - Routed bridge
>>      - eth0 = internet
>>      - eth1 = lan 172.16.x.x
>>      - xen_bridge = bridge for VM's 192.168.1.x
>>      - iptables NAT and routing  
>>      - attached dom0 and domU ifconfig output
>>      - attached ethtool -k output for the bridge, vif and guest eth0
>> 
>> Triggering workload:
>>      - Well that's were the problem is :-)
>>      - The Guest has a normal disk and swap (phy/lvm) and
>>        shared storage with glusterfs (glusterfs server is on dom0)
>>      - The Guest exposes this storage via webdavs 
>> 
>>      What triggers it is:
>>      - The Guest runs it's rsync of the shared storage to a remote client on 
>> the internet
>>        So this causes traffic from Dom0 to domU (reading storage) ..
>>        back from domU to dom0 and via iptables NAT on to eth0 (actual rsync)
>>        or vice versa when syncing the other way around
>>      - At the same time do a backup from a windows machine from the lan to 
>> the webdavs server
>>        So this causes traffic from eth1 to domU (webdav) .. 
>>        back from domU to dom0 (writing to the shared storage) 
>> 
>> So in essence it is doing quite some netback && netfront stress testing :-)
>> 
>> It seems to only trigger when doing both the rsync and the webdav 
>> simultaneous.
>> 
>> I tried my best to emulate any of this with netperf (and multiple instances),
>> i tried with various (odd) packet sizes and the packet / byte rates 
>> transmitted
>> are higher then with the workload above ... but it doesn't seem to trigger 
>> with netpref
>> 
>> So i don't think it will be easy to replicate ...
>> 

> Sorry, this is probably not something I can setup. *sigh*

>> Perhaps running through the available logging again .. and try to answer 
>> some questions ... this is just with one guest running kernels as before 
>> only added debugging to netfront and xen (diffs attached):
>> 
>> Mar 12 02:00:44 backup kernel: [  496.840646] net eth0: rx->offset: 0, size: 
>> 4294967295

> This "size" here is actually "status", status=-1, it's an error...

>> Mar 12 02:00:44 backup kernel: [  496.840665] net eth0: cons:1346005 slots:1 
>> rp:1346013 max:18 err:0 rx->id:212 rx->offset:0 size:4294967295 ref:572 
>> pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 gso_type:0 gso_segs:0 
>> RING_HAS_UNCONSUMED_RESPONSES:9 cons_changed:1 cons_before:1346004 
>> xennet_get_extras_err:0
>> Mar 12 02:00:44 backup kernel: [  496.840680] net eth0: rx->offset: 0, size: 
>> 4294967295
>> Mar 12 02:00:44 backup kernel: [  496.840687] net eth0: cons:1346005 slots:2 
>> rp:1346013 max:18 err:-22 rx->id:214 rx->offset:0 size:4294967295 ref:657 
>> pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 gso_type:0 gso_segs:0 
>> RING_HAS_UNCONSUMED_RESPONSES:9 cons_changed:1 cons_before:1346004 
>> xennet_get_extras_err:0
>> Mar 12 02:00:44 backup kernel: [  496.840701] net eth0: rx->offset: 0, size: 
>> 4294967295
>> Mar 12 02:00:44 backup kernel: [  496.840712] net eth0: cons:1346005 slots:3 
>> rp:1346013 max:18 err:-22 rx->id:215 rx->offset:0 size:4294967295 ref:667 
>> pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 gso_type:0 gso_segs:0 
>> RING_HAS_UNCONSUMED_RESPONSES:9 cons_changed:1 cons_before:1346004 
>> xennet_get_extras_err:0
>> Mar 12 02:00:44 backup kernel: [  496.840733] net eth0: rx->offset: 0, size: 
>> 4294967295
>> Mar 12 02:00:44 backup kernel: [  496.840740] net eth0: cons:1346005 slots:4 
>> rp:1346013 max:18 err:-22 rx->id:216 rx->offset:0 size:4294967295 ref:716 
>> pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 gso_type:0 gso_segs:0 
>> RING_HAS_UNCONSUMED_RESPONSES:9 cons_changed:1 cons_before:1346004 
>> xennet_get_extras_err:0
>> Mar 12 02:00:44 backup kernel: [  496.840757] net eth0: rx->offset: 0, size: 
>> 4294967295
>> Mar 12 02:00:44 backup kernel: [  496.840764] net eth0: cons:1346005 slots:5 
>> rp:1346013 max:18 err:-22 rx->id:217 rx->offset:0 size:4294967295 ref:755 
>> pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 gso_type:0 gso_segs:0 
>> RING_HAS_UNCONSUMED_RESPONSES:9 cons_changed:1 cons_before:1346004 
>> xennet_get_extras_err:0
>> Mar 12 02:00:44 backup kernel: [  496.840778] net eth0: rx->offset: 0, size: 
>> 4294967295
>> Mar 12 02:00:44 backup kernel: [  496.840784] net eth0: cons:1346005 slots:6 
>> rp:1346013 max:18 err:-22 rx->id:218 rx->offset:0 size:4294967295 ref:592 
>> pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 gso_type:0 gso_segs:0 
>> RING_HAS_UNCONSUMED_RESPONSES:9 cons_changed:1 cons_before:1346004 
>> xennet_get_extras_err:0
>> Mar 12 02:00:44 backup kernel: [  496.840801] net eth0: rx->offset: 0, size: 
>> 4294967295
>> Mar 12 02:00:44 backup kernel: [  496.840807] net eth0: cons:1346005 slots:7 
>> rp:1346013 max:18 err:-22 rx->id:219 rx->offset:0 size:4294967295 ref:633 
>> pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 gso_type:0 gso_segs:0 
>> RING_HAS_UNCONSUMED_RESPONSES:9 cons_changed:1 cons_before:1346004 
>> xennet_get_extras_err:0
>> Mar 12 02:00:44 backup kernel: [  496.841043] net eth0: rx->offset: 0, size: 
>> 4294967295
>> 
>> Mar 12 02:00:44 backup kernel: [  496.841052] net eth0: cons:1346025 slots:1 
>> rp:1346038 max:18 err:0 rx->id:232 rx->offset:0 size:4294967295 
>> ref:-131941395332491 pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 
>> gso_type:0 gso_segs:0 RING_HAS_UNCONSUMED_RESPONSES:13 cons_changed:1 
>> cons_before:1346024 xennet_get_extras_err:0
>> Mar 12 02:00:44 backup kernel: [  496.841067] net eth0: rx->offset: 0, size: 
>> 4294967295
>> Mar 12 02:00:44 backup kernel: [  496.841074] net eth0: cons:1346025 slots:2 
>> rp:1346038 max:18 err:-22 rx->id:234 rx->offset:0 size:4294967295 
>> ref:-131941395332579 pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 
>> gso_type:0 gso_segs:0 RING_HAS_UNCONSUMED_RESPONSES:29 cons_changed:1 
>> cons_before:1346024 xennet_get_extras_err:0
>> Mar 12 02:00:44 backup kernel: [  496.841092] net eth0: rx->offset: 0, size: 
>> 4294967295
>> Mar 12 02:00:44 backup kernel: [  496.841101] net eth0: cons:1346025 slots:3 
>> rp:1346038 max:18 err:-22 rx->id:235 rx->offset:0 size:4294967295 
>> ref:-131941395332408 pagesize:4096 skb_ipsummed:0 is_gso:0 gso_size:0 
>> gso_type:0 gso_segs:0 RING_HAS_UNCONSUMED_RESPONSES:29 cons_changed:1 
>> cons_before:1346024 xennet_get_extras_err:0
>> 
>> 
>> (XEN) [2014-03-12 01:00:44] grant_table.c:1856:d0v2 Bad grant reference 
>> 4325377 gt_version:1 ldom:0 readonly:0 allow_transitive:1
>> (XEN) [2014-03-12 01:00:44] grant_table.c:2100:d0v2 acquire_grant_for_copy 
>> failed .. dest_is_gref rc:-3 source.domid:32752 dest.domid:1 s_frame:5478146 
>> source_off:0 source_len:4096 op->source.offset:0 op->len:1168

> rc = -3 when acquiring grant. This means grant reference is unrecognised
> or inappropriate. The grant reference is 4325377 which is obviously too
> large for a sensible grant ref. It's just garbage.

>> (XEN) [2014-03-12 01:00:44] grant_table.c:1856:d0v2 Bad grant reference 
>> 19791875 gt_version:1 ldom:0 readonly:0 allow_transitive:1
>> (XEN) [2014-03-12 01:00:44] grant_table.c:2100:d0v2 acquire_grant_for_copy 
>> failed .. dest_is_gref rc:-3 source.domid:32752 dest.domid:1 s_frame:5497610 
>> source_off:0 source_len:4096 op->source.offset:0 op->len:2476
>> (XEN) [2014-03-12 01:00:44] grant_table.c:1856:d0v2 Bad grant reference 
>> 4325379 gt_version:1 ldom:0 readonly:0 allow_transitive:1
>> (XEN) [2014-03-12 01:00:44] grant_table.c:2100:d0v2 acquire_grant_for_copy 
>> failed .. dest_is_gref rc:-3 source.domid:32752 dest.domid:1 s_frame:5478282 
>> source_off:0 source_len:4096 op->source.offset:0 op->len:1634
>> (XEN) [2014-03-12 01:00:44] grant_table.c:1856:d0v2 Bad grant reference 
>> 4325379 gt_version:1 ldom:0 readonly:0 allow_transitive:1
>> (XEN) [2014-03-12 01:00:44] grant_table.c:2100:d0v2 acquire_grant_for_copy 
>> failed .. dest_is_gref rc:-3 source.domid:32752 dest.domid:1 s_frame:5497610 
>> source_off:0 source_len:4096 op->source.offset:1634 op->len:1620
>> (XEN) [2014-03-12 01:00:44] grant_table.c:1856:d0v2 Bad grant reference 
>> 4325377 gt_version:1 ldom:0 readonly:0 allow_transitive:1
>> (XEN) [2014-03-12 01:00:44] grant_table.c:2100:d0v2 acquire_grant_for_copy 
>> failed .. dest_is_gref rc:-3 source.domid:32752 dest.domid:1 s_frame:5497609 
>> source_off:0 source_len:4096 op->source.offset:0 op->len:4096
>> (XEN) [2014-03-12 01:00:44] grant_table.c:1856:d0v2 Bad grant reference 
>> 19791875 gt_version:1 ldom:0 readonly:0 allow_transitive:1
>> 
>> - Sometimes (but not always) netfront also spits out:
>>   dev_warn(dev, "Invalid extra type: %d\n", extra->type);
>>   where the extra type seems a random value (seen 196, 31 ..)
>> - Sometimes (but not always) netfront also spits out:
>>   dev_warn(dev, "Need more slots\n");
>> - Sometimes (but not always) netfront also spits out:
>>   dev_warn(dev, "Missing extra info\n");
>> 

> That's because garbage is pushed to frontend. It's trying to decode some
> random values.

>> 
>> First question that comes to my mind:
>> - Are the warnings netfront spits out the cause of xen reporting the bad 
>> grant reference ?
>>   Or
>>   Are the Bad grant references Xen is reporting .. causing netfront to spit 
>> out the warnings ?

> No. They are not directly connected.

> 0. backend breaks down a skb into slots.
> 1. backend gets a request from the rx ring (a slot with gref provided by
>    frontend), until all slots in skb are handled.
> 2. backend grant-copies data to that slot (with gref).
> 3. backend pushes a response to frontend, whose content depends on the
>    result of previous step.
> 4. frontend handles that response.

> So the grant copy error in hypervisor you're seeing is in 2. The
> complaint from frontend is in 4, when backend pushes a response with
> error status in it. The bug is most likely in 0, when the calculation
> goes wrong - either in the breakdown process, or in the macro that
> returns usable slots (this can lead to backend thinks there's enough
> slots while there's not).

Ok so perhaps a debug patch which does more sanity checking there ?
Because i don't see any errors/warnings from netback in dom0.

>> - Why is that "if (rx->flags & XEN_NETRXF_extra_info) {}" part in 
>> xen-netfront.c doing there and changing cons midway ?
>>   The commit message from f942dc2552b8bfdee607be867b12a8971bb9cd85 that 
>> introduced the if says:
>> 
>>     One major change from xen.git is that the guest transmit path (i.e. what
>>     looks like receive to netback) has been significantly reworked to remove
>>     the dependency on the out of tree PageForeign page flag (a core kernel
>>     patch which enables a per page destructor callback on the final
>>     put_page). This page flag was used in order to implement a grant map
>>     based transmit path (where guest pages are mapped directly into SKB
>>     frags). Instead this version of netback uses grant copy operations into
>>     regular memory belonging to the backend domain. Reinstating the grant
>>     map functionality is something which I would like to revisit in the
>>     future.
>> 
>>   It *is* using grant copy now .. so should this part have been removed ?
>>   And
>>   Could Paul's commit that seems to be the first to trigger these events 
>> affect this ?
>> 

> This depicts guest *transmit* path, but the issue you're seeing is in
> guest *receive* path. So that's totally different thing.

Errr yes well if the (rx->flags & XEN_NETRXF_extra_info) {}) should only be 
doing something on the transmit path ..
why does it seem to result to true on my issue ?  (see cons_changed=1 in the 
debub output.. and you see cons being cons + 1 after that)


> Wei.



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.