
[Xen-devel] Rebooting domu fails in nfs share exported from another domu on the same dom0



Hi

I hit a problem in the following scenario: vm1 is running and exports an NFS share, dom0 mounts this share, and vm2 is booted from an image stored on it. vm1 and vm2 are running on the same dom0.

When this bug happens, the data flow is: vm2 blkfront -> vm2 blkback -> loop -> NFS file -> NFS client -> bridge priv1 -> vm1 vif -> vm1 netback -> vm1 netfront.

In the above data flow, NFS uses direct I/O, and blkfront and blkback use grant mapping. Because of this, the same pages are passed along by mapping all the way from vm2 blkfront to vm1 netback. However, when netback does a grant copy, an error occurs in this call chain: __gnttab_copy -> __get_paged_frame -> get_page_from_gfn -> get_page.
See get_page() in /xen/arch/x86/mm.c:
    if ( likely(owner == domain) )
        return 1;
In the above if condition, the source page comes from vm2, so owner is vm2, while domain is dom0 here. get_page therefore returns 0, so get_page_from_gfn returns NULL and __get_paged_frame returns GNTST_bad_page. Finally, put_page is called directly in __grant_copy and the grant copy fails in netback. As a result, the write to the NFS file fails and damages the file, so the VM cannot be rebooted successfully.
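
To make the failure path easier to follow, here is a minimal sketch of the ownership check as a standalone model. It is not the Xen code: the types, the model_ prefixed names, and the status values are all hypothetical stand-ins, and the real functions do considerably more (reference counting on count_info, P2M lookups, and so on). It only shows how a page owned by one domain is rejected when the copy is attempted on behalf of another.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT Xen's real types. */
typedef unsigned short model_domid_t;

struct model_page {
    model_domid_t owner;    /* domain that owns the frame */
    unsigned int refcount;  /* simplified reference count */
};

/* Placeholder status codes; the numeric values are arbitrary. */
enum model_status {
    MODEL_GNTST_okay = 0,
    MODEL_GNTST_bad_page = -1
};

/* Models the owner == domain test: take a reference only if
 * the page belongs to the domain asking for it. */
int model_get_page(struct model_page *page, model_domid_t domain)
{
    assert(page != NULL);
    if (page->owner == domain) {
        page->refcount++;
        return 1;
    }
    return 0;
}

/* Drops a reference taken by model_get_page(). */
void model_put_page(struct model_page *page)
{
    assert(page != NULL && page->refcount > 0);
    page->refcount--;
}

/* Models the source side of a grant copy: if the reference cannot
 * be taken, report a bad page and never reach the copy itself. */
enum model_status model_grant_copy_source(struct model_page *page,
                                          model_domid_t domain)
{
    if (!model_get_page(page, domain))
        return MODEL_GNTST_bad_page;
    /* The actual data copy would happen here. */
    model_put_page(page);
    return MODEL_GNTST_okay;
}
```

In this model, a page with owner 2 (standing in for vm2) passed to model_grant_copy_source() with domain 0 (standing in for dom0) yields MODEL_GNTST_bad_page, while the same call with domain 2 succeeds. This mirrors the case above, where dom0 only has vm2's page mapped and does not own it.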

Disabling NFS direct I/O is a workaround, but it causes a performance penalty. Introducing a copy anywhere between vm2 blkfront and vm1 netback would probably also help in this case. But zero-copy is best for performance, so are there any suggestions for this issue?

This issue is very similar to this one: http://lists.xen.org/archives/html/xen-devel/2012-12/msg01722.html. Roger, did you fix this issue in your case?

Thanks
Annie



 

