
Re: [Xen-devel] [PATCH RFC v2 1/4] x86/mm: Shadow and p2m changes for PV mem_access



>>> As Andrew already pointed out, you absolutely need to deal with page
>>> crossing accesses,
>> Is this for say an unsigned long that lives across two pages? Off the top
>> of my head, I think always allowing writes to the page in question and the
>> next followed by reverting to default for both pages at the end of the
>> write should take care of this. I would have to walk the page tables to
>> figure out the next mfn. Or am I on the wrong track here?
>
>create_bounce_frame puts several adjacent words on a guest stack, and this
>is very capable of crossing a page boundary.
>
>Even an unaligned uint16_t can cross a page boundary.

OK, so marking two adjacent pages as writable and reverting after the write 
went through should solve this problem.
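
Something along these lines is what I have in mind (a plain C sketch that
compiles outside of Xen, just to show the arithmetic; relax_write_access()
and restore_default_access() are made-up placeholders for whatever the
shadow code would actually call):

#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Hypothetical hooks: grant _PAGE_RW / restore the listener's default. */
static void relax_write_access(unsigned long va)
{
    printf("grant rw on page at %#lx\n", va);
}

static void restore_default_access(unsigned long va)
{
    printf("restore default on page at %#lx\n", va);
}

static void handle_straddling_write(unsigned long gva, unsigned long bytes)
{
    unsigned long first_page = gva & ~(PAGE_SIZE - 1);
    unsigned long last_page  = (gva + bytes - 1) & ~(PAGE_SIZE - 1);
    unsigned long va;

    /* Each page would still need its own pagetable walk to find the mfn. */
    for ( va = first_page; va <= last_page; va += PAGE_SIZE )
        relax_write_access(va);

    /* ... the faulting write is retried/emulated here ... */

    for ( va = first_page; va <= last_page; va += PAGE_SIZE )
        restore_default_access(va);
}

int main(void)
{
    /* An unaligned uint16_t at the last byte of a page touches two pages. */
    handle_straddling_write(0x10fff, 2);
    return 0;
}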

>>> and I think you also need to deal with hypervisor accesses extending
>>> beyond a page worth of memory (I'm not sure we have a firmly
>>> determined upper bound of how much memory we may copy in one go).
>> Let me try to understand what happens in the non-mem_access case. Say
>> the hypervisor is writing to three pages and all of them are not
>> accessible in the guest. Which one of the following is true?
>> 1. There is a pagefault for the first page which is resolved. The write is
>>    then retried which causes a fault for the second page which is
>>    resolved. Then the write is retried starting from the second page and
>>    so on for the third page too.
>> 2. Or does the write get retried starting from the first page each time
>>    the page fault is resolved?
>
>For the non-mem_access case, all faults cause failures.
>
>copy_to/from_user() will typically result in an -EFAULT being handed back to
>the hypercaller.  For create_bounce_frame, the results are more severe and
>might result in a domain crash or an injection of a failsafe callback.
>
>No attempt is made to play with the page permissions, as it is the guest's
>fault that the pages have the wrong permissions.
>
>What mem_access introduces is a case where it is Xen's fault that a write
>fault occurred, and the fault should be worked around as the guest is unaware
>that its pages are actually read-only.
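
Right -- so in the normal case the hypercall just hands -EFAULT back,
roughly like this (do_example_op() and its uint64_t argument are invented
purely for illustration):

#include <xen/types.h>
#include <xen/errno.h>
#include <xen/guest_access.h>

/* Usual idiom: no attempt to fix up the guest's permissions, just fail. */
long do_example_op(XEN_GUEST_HANDLE_PARAM(uint64_t) arg)
{
    uint64_t val;

    if ( copy_from_guest(&val, arg, 1) )
        return -EFAULT;          /* bad/unmapped guest pointer */

    val++;

    if ( copy_to_guest(arg, &val, 1) )
        return -EFAULT;          /* the write side can fault too */

    return 0;
}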

Ouch, this does make things complicated. The only thing I can think of trying
is your suggestion: "Alternatively, locate the page in question and use
map_domain_page() to get a supervisor rw mapping." Do this only in
__copy_to_user_ll(), for copies that span multiple pages, and only when a
mem_access listener is present and listening for write violations.
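
Very roughly (pseudo-code against the hypervisor tree, not a real patch;
gva_to_mfn() is an invented helper standing in for the guest pagetable /
shadow walk, and the exact map_domain_page()/mfn types differ between
trees):

static unsigned long copy_to_guest_via_map(struct vcpu *v, unsigned long gva,
                                           const void *src,
                                           unsigned long bytes)
{
    while ( bytes )
    {
        unsigned long offs = gva & (PAGE_SIZE - 1);
        unsigned long chunk = PAGE_SIZE - offs;
        mfn_t mfn = gva_to_mfn(v, gva);       /* hypothetical walk */
        char *map;

        if ( chunk > bytes )
            chunk = bytes;
        if ( mfn_eq(mfn, INVALID_MFN) )
            return bytes;                     /* not mapped: give up */

        map = map_domain_page(mfn);           /* supervisor rw mapping */
        memcpy(map + offs, src, chunk);
        unmap_domain_page(map);

        gva += chunk;
        src = (const char *)src + chunk;
        bytes -= chunk;
    }

    return 0;                                 /* i.e. bytes not copied */
}

The return convention is meant to mirror __copy_to_user_ll() (number of
bytes left uncopied).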

Sigh, if only I could bound the CR0.WP solution :-(

>>>> +                if ( guest_l1e_get_flags(gw.l1e) & _PAGE_RW )
>>>>                  {
>>>> -                    cr0 &= ~X86_CR0_WP;
>>>> -                    write_cr0(cr0);
>>>> -                    v->arch.pv_vcpu.need_cr0_wp_set = 1;
>>>> +                    domain_pause_nosync(d);
>>> I don't think a "nosync" pause is enough here, as that leaves a
>>> window for the guest to write to the page. Since the sync version may
>>> take some time to complete it may become difficult for you to
>>> actually handle this in an acceptable way.
>> Are you worried about performance or is there some other issue?
>
>Both performance and correctness.  With nosync(), guest vcpus can still be
>running on other pcpus, and playing with this pagetable entry.
>
>The synchronous variants can block for a moderate period of time.

OK, I don't follow why pausing the other vcpus synchronously is an issue here.
But if pausing the other guest vcpus synchronously is not even an option, then
it looks like I am at a dead end even if I solve the writes spanning multiple
pages issue.
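
Just so I am sure I understand the pattern being discussed (sketch only;
grant_temporary_rw()/revoke_temporary_rw() are invented placeholders for
the actual PTE change, and this assumes d is not the faulting vcpu's own
domain):

static void fixup_write_fault_paused(struct domain *d, unsigned long gfn)
{
    domain_pause(d);               /* synchronous: returns only once every
                                    * vcpu of d is descheduled, so nothing
                                    * can race with the temporary change */
    grant_temporary_rw(d, gfn);    /* hypothetical */

    /* ... retry/emulate the faulting write ... */

    revoke_temporary_rw(d, gfn);   /* hypothetical */
    domain_unpause(d);
}

whereas domain_pause_nosync() only marks the domain paused and does not
wait, which I gather is the window you and Jan are worried about.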

Thanks,
Aravindh



 

