
Re: [Xen-devel] xen.git branch reorg / success with 2.6.30-rc3 pv_ops dom0



On 06/11/09 02:02, Ian Campbell wrote:
On Tue, 2009-06-09 at 13:28 -0400, Jeremy Fitzhardinge wrote:
Ian Campbell wrote:
I wonder how this interacts with the logic in
arch/x86/xen/mmu.c:xen_pin_page() which holds the lock while waiting for
the (deferred) pin multicall to occur? Hmm, no this is about the
PagePinned flag on the struct page which is out of date WRT the actual
pinned status as Xen sees it -- we update the PagePinned flag early in
xen_pin_page() long before Xen sees the pin hypercall so this window is the
other way round to what would be needed to trigger this bug.

Yes, it looks like you could get a bad mapping here.  An obvious fix
would be to defer clearing the pinned flag in the page struct until
after the hypercall has been issued.  That would make the racy
kmap_atomic_pte map RO, which would be fine unless it actually tries to
modify it (but I can't imagine it would do that unlocked).

But would it redo the mapping after taking the lock? It doesn't look
like it does (why would it?). So we could end up writing to an unpinned
pte via a R/O mapping.

Hm, yep. One thing I noticed is that set_pte() is used very rarely, so it would be no cost to always use a hypercall in that case. But xen_set_pte_at() ends up calling xen_set_pte() as well, and I think that's more common. Certainly we need to make sure that we're actually taking advantage of late-pin by directly writing unpinned ptes.

I've been thinking of rearranging the set_pte(_at) pvops a little bit anyway; it's not obvious we're really getting much benefit from using the update_va_mapping hypercall, and if we're not using it, then the set_pte_at pvop is taking a lot of unused parameters.

If we switch to just using mmu_update, then we can just pass the address and pte value. But we could also pass the struct page * (which makes a bit of conceptual sense), so we could easily test directly whether the pte is pinned, and either use a direct write or a hypercall accordingly.
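
As a rough illustration of that last idea, here is a minimal userspace sketch of a pin-aware pte write. Everything in it is a hypothetical stand-in: `struct model_page`, `model_set_pte()` and the two write helpers are invented for the example and are not the kernel's pvops or Xen's hypercall API. The "hypercall" is just a counted store, so the sketch only shows the dispatch on the pinned flag, not a real mmu_update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are the kernel's real definitions. */
typedef uint64_t pteval_t;

struct model_page {
    bool pinned;                /* models the PagePinned flag */
};

static unsigned int direct_writes;  /* how often we stored directly */
static unsigned int hypercalls;     /* how often we "asked Xen" */

/* Models a plain store to a pte page that is unpinned and still writable. */
static void model_direct_write(pteval_t *ptep, pteval_t val)
{
    *ptep = val;
    direct_writes++;
}

/* Models handing the update to the hypervisor (for example via mmu_update). */
static void model_hypercall_write(pteval_t *ptep, pteval_t val)
{
    *ptep = val;
    hypercalls++;
}

/*
 * Pin-aware update: the caller passes the page backing the pte, so the
 * choice between a direct write and a hypercall is a single flag test.
 */
static void model_set_pte(struct model_page *pg, pteval_t *ptep, pteval_t val)
{
    if (pg->pinned)
        model_hypercall_write(ptep, val);
    else
        model_direct_write(ptep, val);
}
```

The dispatch is only as good as the flag it tests: if the flag and Xen's view of the page disagree at the moment of the test, the wrong path is taken, which is the window discussed in the rest of this thread.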

As an experiment I tried the simple approach of flushing the multicalls
explicitly in xen_unpin_page and then clearing the Pinned bit and it all
goes a bit wrong. eip is "ptep->pte_low = 0" so I think the unpinned but
R/O theory holds...
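
The "unpinned but R/O" theory can be sketched as a toy userspace model. This is an assumption-laden illustration, not the kernel code: `struct model_pte_page`, `model_map_pte()`, `model_unpin_page()` and `model_clear_pte()` are hypothetical names, and a failed write is reported as a `false` return instead of a real page fault. The only point it makes is that the mapping protection is chosen once, at map time, and is not revisited after the flag changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pteval_t;

enum model_prot { MODEL_RO, MODEL_RW };

/* Hypothetical model of a pte page; not a real kernel structure. */
struct model_pte_page {
    bool pinned_flag;    /* models PagePinned on the struct page */
    bool pinned_in_xen;  /* models Xen's own view of the page */
    pteval_t pte;        /* a single pte stored in the page */
};

/* Models the racy mapper: protection is decided from the flag at map time. */
static enum model_prot model_map_pte(const struct model_pte_page *pg)
{
    return pg->pinned_flag ? MODEL_RO : MODEL_RW;
}

/* Models the experiment: flush the unpin first, clear the flag afterwards. */
static void model_unpin_page(struct model_pte_page *pg)
{
    pg->pinned_in_xen = false;  /* unpin hypercall has been flushed */
    pg->pinned_flag = false;    /* flag is cleared only after that */
}

/*
 * Models "ptep->pte_low = 0" through an existing mapping.
 * Returns false where the real code would take a fault.
 */
static bool model_clear_pte(struct model_pte_page *pg, enum model_prot prot)
{
    if (pg->pinned_flag) {      /* pinned: the update would go via Xen */
        pg->pte = 0;
        return true;
    }
    if (prot == MODEL_RO)       /* unpinned, but the mapping is stale R/O */
        return false;
    pg->pte = 0;                /* unpinned and mapped R/W: direct write */
    return true;
}

/* Map while pinned, unpin, then write through the stale mapping. */
static bool model_race_faults(void)
{
    struct model_pte_page pg = {
        .pinned_flag = true, .pinned_in_xen = true, .pte = 0x1000
    };
    enum model_prot stale = model_map_pte(&pg);

    model_unpin_page(&pg);
    return !model_clear_pte(&pg, stale);
}
```

In this model the write fails only when the mapping was taken before the unpin and reused afterwards; mapping after the unpin yields a writable mapping and the direct write goes through.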

Yes, I think the theory is sound. But I'm curious why Pasi seems to be able to hit the race easily, but we have not...

    J



 

