
[Xen-devel] spurious warnings from get_page() via gnttab_copy() during frontend shutdown


  • To: xen-devel@xxxxxxxxxxxxxxxxxxx
  • From: David Edmondson <dme@xxxxxxx>
  • Date: Tue, 27 Nov 2007 09:26:53 +0000
  • Delivery-date: Tue, 27 Nov 2007 01:27:54 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

In testing our implementation of the hypervisor-copy-based backend->frontend networking changes, we see what I believe are spurious warning messages during the shutdown of the frontend domain:

(XEN) /export/build/schuster/xvm-gate/xen.hg/xen/include/asm/mm.h:189:d0 Error pfn 30e290: rd=ffff830000fcf100, od=ffff830000fcf100, caf=00000000, taf=0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff83000010f240>] get_page+0x107/0x1b4
(XEN)    [<ffff83000010f10a>] get_page_and_type+0x21/0x50
(XEN)    [<ffff8300001116c4>] __gnttab_copy+0x3f5/0x5b4
(XEN)    [<ffff830000111971>] gnttab_copy+0xee/0x1c4
(XEN)    [<ffff830000111dbd>] do_grant_table_op+0x376/0x3bc
(XEN)    [<ffff8300001b83e2>] syscall_enter+0xa2/0xfc
(XEN)
(XEN) Guest stack trace from rbp=ffff830000ff3cf8:
(XEN) ???????????????? <G><2>grant_table.c:990:d0 do_grant_table_op: domain 0, cmd 5, count 1
(XEN) /export/build/schuster/xvm-gate/xen.hg/xen/include/asm/mm.h:189:d0 Error pfn 30fc2a: rd=ffff830000fcf100, od=0000000000000000, caf=00000000, taf=0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff83000010f240>] get_page+0x107/0x1b4
(XEN)    [<ffff83000010f10a>] get_page_and_type+0x21/0x50
(XEN)    [<ffff8300001116c4>] __gnttab_copy+0x3f5/0x5b4
(XEN)    [<ffff830000111971>] gnttab_copy+0xee/0x1c4
(XEN)    [<ffff830000111dbd>] do_grant_table_op+0x376/0x3bc
(XEN)    [<ffff8300001b83e2>] syscall_enter+0xa2/0xfc
(XEN)

What we think is happening is that the frontend dies and most of its pages are freed (those that are not referenced by another domain). The backend doesn't yet know that the frontend has died, so it is still trying to pass packets along to it. It has the rx ring mapped (meaning that the ring can't be freed) and reads previously advertised grant references from it. Those grants now refer to pages that are no longer valid, because only the frontend held references to them and they have been freed, so get_page() complains.
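To make the suspected sequence concrete, here is a minimal toy model in C. It is not Xen code: the toy_* names and struct layouts are invented for illustration, and it assumes that a reference can only be taken on a page whose count is non-zero and whose owner is the expected domain (which is how I read caf=00000000 and od=0 in the log above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a domain and a page; not the Xen structures. */
struct toy_domain { int id; };
struct toy_page {
    unsigned int count;        /* outstanding references (assumed meaning of "caf") */
    struct toy_domain *owner;  /* owning domain (assumed meaning of "od") */
};

/* Take a reference on behalf of domain d; fail if the page is already
 * free or is not owned by d. */
static bool toy_get_page(struct toy_page *pg, struct toy_domain *d)
{
    assert(pg != NULL && d != NULL);
    if (pg->count == 0 || pg->owner != d)
        return false;
    pg->count++;
    return true;
}

/* Drop a reference; the last put frees the page and clears its owner. */
static void toy_put_page(struct toy_page *pg)
{
    assert(pg != NULL && pg->count > 0);
    if (--pg->count == 0)
        pg->owner = NULL;
}
```

In this model, once the frontend drops the only reference to one of its pages, any later toy_get_page() issued on the backend's behalf through a stale grant fails, which is the situation we believe the two log entries show.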

__gnttab_copy() itself seems prepared for this situation, as failures to grab the target page due to a dying domain are correctly handled:

    if ( !get_page_and_type(mfn_to_page(d_frame), dd, PGT_writable_page) )
    {
        if ( !test_bit(_DOMF_dying, &dd->domain_flags) )
            gdprintk(XENLOG_WARNING, "Could not get dst frame %lx \n", d_frame);
        rc = GNTST_general_error;
        goto error_out;
    }

In our testing we believe that we're following this path (_DOMF_dying is set and rc == GNTST_general_error) and that we handle the failure correctly.
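The asymmetry being described can be sketched as follows. This is a hedged, self-contained model rather than the hypervisor source: the sketch_* names, the message counters and the error constant are all hypothetical, and the callee is assumed to report every failure without looking at whether the domain is dying.

```c
#include <stdbool.h>
#include <stddef.h>

enum { SKETCH_OKAY = 0, SKETCH_GENERAL_ERROR = -1 };

struct sketch_domain { bool dying; };
struct sketch_page { unsigned int count; struct sketch_domain *owner; };

/* Counts messages instead of printing, so the behaviour is easy to check. */
struct sketch_log { int callee_msgs; int caller_msgs; };

/* Callee: reports every failure, with no knowledge of the dying flag
 * (an assumption of this sketch). */
static bool sketch_get_page(struct sketch_page *pg, struct sketch_domain *d,
                            struct sketch_log *log)
{
    if (pg->count == 0 || pg->owner != d) {
        log->callee_msgs++;
        return false;
    }
    pg->count++;
    return true;
}

/* Caller: same shape as the quoted check, warning only when the
 * destination domain is not dying. */
static int sketch_copy_dst(struct sketch_page *pg, struct sketch_domain *dd,
                           struct sketch_log *log)
{
    if (!sketch_get_page(pg, dd, log)) {
        if (!dd->dying)
            log->caller_msgs++;
        return SKETCH_GENERAL_ERROR;
    }
    pg->count--;  /* drop the reference again; the copy itself is elided */
    return SKETCH_OKAY;
}
```

With a dying destination domain and an already freed page, sketch_copy_dst() returns the error and leaves caller_msgs at 0 while callee_msgs becomes 1: the caller stays quiet, but the lower-level message is still emitted.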

The corresponding failure mode in the page flip code path doesn't result in any INFO warnings. Should they exist in this case?

dme.
--
David Edmondson, Solaris Engineering, http://dme.org





 

