Re: [Xen-devel] netfront.c: gnttab_query_foreign_access returns nonzero in network_tx_buf_gc
>>> On Thu, May 25, 2006 at 10:37 AM, in message <04b301c68019$96989e80$0302a8c0@Violet>, "Steven Hand" <steven.hand@xxxxxxxxxxxx> wrote:
>> I've been working from the netfront.c in the testing tree and using SLES
>> 10 RC1 for i386 on an SMP box. When I stress the network using iperf in
>> a domU, domU acting as client on a gigabit network, I occasionally get a
>> panic at the dev_kfree_skb_irq(skb); line. This is the same panic as
>> reported in
>> http://lists.xensource.com/archives/html/xen-devel/2006-05/msg00919.html
>>
>> The trace indicates that the skb is bad and it looks like the skb is
>> an id. Investigating further, the condition occurs if
>> gnttab_query_foreign_access returns nonzero on a second or later
>> iteration through the for loop. If it returns nonzero, the code
>> takes the 'goto out', which bypasses fixing up np->tx.rsp_cons. Then
>> the next time in network_tx_buf_gc we reuse np->tx.rsp_cons, which is at
>> the location of a previously completed skb, and the skb gets an id and
>> not an skb.
>>
>> Looking at the unstable tree, the goto has been removed and replaced
>> with a break. However, it looks like if gnttab_query_foreign_access
>> returns nonzero between np->tx.rsp_cons and prod, then the
>> np->tx.rsp_cons = prod; could advance np->tx.rsp_cons too far, causing
>> other problems later (I have not tested this yet though).
>
> Yes, this definitely looks like a bug; the 'break' in -unstable is not
> really much better than the 'goto out:' in -testing since in either case
> we can't easily correctly recover.
>
>> The problem I'm having is that I can't find the root cause as to why
>> gnttab_query_foreign_access returns an 8 (GTF_reading?) and not 0. I've
>> looked in netback.c and xen/common/grant_table.c and am not seeing
>> it (not that it's not there).
>
> Well all this means is that netback is still using the grant, which should
> of course be impossible since the ring pointers have been advanced. I.e.
> something is borked.
>
> Can you try this with a debug build of xen? It would be interesting to see
> if xen complains about any grant refs prior to this occurrence...
>
> cheers,
>
> S.

Here's the serial output from a debug build of xen. The domain_crash
does not happen on the non-debug xen.

.
.
.
(XEN) (file=memory.c, line=64) Could not allocate order=0 extent: id=0 flags=0 (61 of 64)
(XEN) (file=memory.c, line=64) Could not allocate order=0 extent: id=0 flags=0 (59 of 64)
(XEN) (file=memory.c, line=64) Could not allocate order=0 extent: id=0 flags=0 (61 of 64)
(XEN) (file=memory.c, line=64) Could not allocate order=0 extent: id=0 flags=0 (58 of 64)
(XEN) (file=memory.c, line=64) Could not allocate order=0 extent: id=0 flags=0 (61 of 64)
(XEN) (file=memory.c, line=64) Could not allocate order=0 extent: id=0 flags=0 (60 of 64)
(XEN) DOM0: (file=mm.c, line=2449) PTE entry 0 for address f2c81000 doesn't match frame 7a568
(XEN) DOM0: (file=mm.c, line=637) Attempt to implicitly unmap a granted PTE 4b2fe861
(XEN) domain_crash called from mm.c:638
(XEN) Domain 0 (vcpu#1) crashed on cpu#1:
(XEN) ----[ Xen-3.0.2_09668-0.1 Not tainted ]----
(XEN) CPU: 1
(XEN) EIP: 0061:[<c0101287>]
(XEN) EFLAGS: 00200212 CONTEXT: guest
(XEN) eax: 00000014 ebx: 00000000 ecx: f4c312c0 edx: 00000001
(XEN) esi: f2881f34 edi: f4c2eb9c ebp: f364e408 esp: f2881ee4
(XEN) cr0: 80050033 cr3: 79a7d000
(XEN) ds: 007b es: 007b fs: 0000 gs: 0033 ss: 0069 cs: 0061
(XEN) Guest stack trace from esp=f2881ee4:
(XEN) f4c24761 f37a2380 f37a2000 f3f02180 f37a2000 f3f02180 c1658c20 f2881f28
(XEN) c014a91a 00483f02 c032c900 00000001 00000001 00915700 c0c78380 00000000
(XEN) 00483f01 00000027 0003013e 05ea0020 f4c28c40 f2880000 00000000 c03acd10
(XEN) c0123a41 00000001 c036e128 f2880000 c03ab180 c0123555 c03ade60 00000007
(XEN) 00000001 f2880000 00000001 fbdf7000 00000020 c0123665 00000013 f2881fbc
(XEN) c01068cc 00000000 c017b610 00000000 00000000 c024d5b1 00000020 00000000
(XEN) b7b8d8d9 08315c88 bfdd38ac bfdd3848 c0105138 f2881fbc b7b8d8d9 00800d4a
(XEN) bfdd3b40 08315c88 bfdd38ac bfdd3848 08315c88 0000007b 0000007b ffffffec
(XEN) b7b8bbba 00000073 00200286 bfdd37bc 0000007b 00000008 0000240b
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.

On a non-debug xen I see destroy_grant_host_mapping failing with rc =
0xffffffff, domain 2, ref 0x20, flags 6 in __gnttab_unmap_grant_ref in
xen/common/grant_table.c. Also in netback.c in net_tx_action_dealloc,
the HYPERVISOR_grant_table_op call succeeds, but if you look at the
status of each of the gnttab_unmap_grant_ref_t entries there is one with
0xffffffff.
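
To make the rsp_cons problem from the quoted analysis concrete, here is a
minimal standalone sketch in C. It is not the netfront.c code: the struct
names, RING_SIZE, the has_skb and grant_in_use flags, and run_scenario are
all hypothetical stand-ins, with grant_in_use modelling a nonzero return
from gnttab_query_foreign_access. gc_buggy mirrors the early exit that
leaves the consumer index stale; gc_record_progress shows one possible
repair (stop at the stuck slot and remember how far we got), which is an
assumption for illustration and not the fix adopted in either tree.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 8u /* hypothetical size; must be a power of two */

/* Hypothetical stand-in for the per-slot state netfront keeps. */
struct tx_slot {
    bool has_skb;      /* slot still owns an in-flight buffer */
    bool grant_in_use; /* models gnttab_query_foreign_access() != 0 */
};

struct tx_ring {
    unsigned int rsp_cons;     /* next response to consume */
    unsigned int rsp_prod;     /* responses published by the backend */
    unsigned int double_frees; /* visits to an already released slot */
    struct tx_slot slots[RING_SIZE];
};

/* Release one slot; a second release stands in for the bad-skb panic. */
static void release_slot(struct tx_ring *r, unsigned int idx)
{
    struct tx_slot *s = &r->slots[idx & (RING_SIZE - 1u)];

    if (!s->has_skb) {
        r->double_frees++;
        return;
    }
    s->has_skb = false;
}

/* Early exit skips the rsp_cons update, like the 'goto out' path. */
static void gc_buggy(struct tx_ring *r)
{
    unsigned int cons, prod = r->rsp_prod;

    for (cons = r->rsp_cons; cons != prod; cons++) {
        if (r->slots[cons & (RING_SIZE - 1u)].grant_in_use)
            return; /* rsp_cons stays stale; freed slots get revisited */
        release_slot(r, cons);
    }
    r->rsp_cons = prod;
}

/* Possible repair (assumption): record progress up to the stuck slot. */
static void gc_record_progress(struct tx_ring *r)
{
    unsigned int cons, prod = r->rsp_prod;

    for (cons = r->rsp_cons; cons != prod; cons++) {
        if (r->slots[cons & (RING_SIZE - 1u)].grant_in_use)
            break;
        release_slot(r, cons);
    }
    r->rsp_cons = cons; /* neither stale nor advanced past the stuck slot */
}

typedef void (*gc_fn)(struct tx_ring *);

/* Four responses pending, slot 2 busy on the first pass, idle on the second. */
static unsigned int run_scenario(gc_fn gc)
{
    struct tx_ring r = {0};
    unsigned int i;

    assert((RING_SIZE & (RING_SIZE - 1u)) == 0);
    r.rsp_prod = 4;
    for (i = 0; i < 4; i++)
        r.slots[i].has_skb = true;
    r.slots[2].grant_in_use = true;
    gc(&r);
    r.slots[2].grant_in_use = false;
    gc(&r);
    return r.double_frees;
}
```

Calling run_scenario(gc_buggy) revisits slots 0 and 1 on the second pass,
while run_scenario(gc_record_progress) resumes at slot 2. Neither variant
explains why the grant is still in use in the first place; that remains
the open question in this thread.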