[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/PV: make post-migration page state consistent



On 11.09.2020 14:37, Jan Beulich wrote:
> On 11.09.2020 13:55, Andrew Cooper wrote:
>> On 11/09/2020 11:34, Jan Beulich wrote:
>>> When a page table page gets de-validated, its type reference count drops
>>> to zero (and PGT_validated gets cleared), but its type remains intact.
>>> XEN_DOMCTL_getpageframeinfo3, therefore, so far reported prior usage for
>>> such pages. An intermediate write to such a page via e.g.
>>> MMU_NORMAL_PT_UPDATE, however, would transition the page's type to
>>> PGT_writable_page, thus altering what XEN_DOMCTL_getpageframeinfo3 would
>>> return. In libxc the decision which pages to normalize / localize
>>> depends solely on the type returned from the domctl. As a result without
>>> further precautions the guest won't be able to tell whether such a page
>>> has had its (apparent) PTE entries transitioned to the new MFNs.
>>
>> I'm afraid I don't follow what the problem is.
>>
>> Yes - unvalidated pages probably ought to be consistently NOTAB, so this
>> is probably a good change, but I don't see how it impacts the migration
>> logic.
> 
> It's not the migration logic itself that's impacted, but the state
> of guest pages after migration. I'm afraid I can only try to expand
> on the original description.
> 
> Case 1: Once an Ln page has been unvalidated, due to the described
> behavior the migration code in libxc will normalize and then localize
> it. Therefore the guest could go and directly try to use it as a
> page table again. This should work as long as all of the entries in
> the page can still be successfully validated (i.e. unless the guest
> itself has made changes to the state of other pages).
> 
> Case 2: Once an Ln page has been unvalidated, the guest for whatever
> reason still writes to it through e.g. MMU_NORMAL_PT_UPDATE. Prior
> to migration, and provided the new entry can be validated (and no
> other reference page has changed state), the page can still be
> converted to a proper page table one again. If, however, migration
> occurs inbetween, the page now won't get normalized and then
> localized. The MFNs in it are unlikely to make sense anymore, and
> hence an attempt to make the page a page table again is likely to
> fail (or if it doesn't fail the result is unlikely to be what's
> intended).
> 
> Since there's no way to make case 2 "work", the only choice is to
> make case 1 behave like case 2, in order for the behavior to be
> predictable / consistent.
> 
>> We already have to cope with a page really changing types in parallel
>> with the normalise/localise logic (that was a "fun" one to debug), which
>> is why errors in that logic are specifically not fatal while the guest
>> is live - the frame gets re-marked as dirty, and deferred until the next
>> round.
>>
>> Errors encountered after the VM has been paused are fatal.
>>
>> However, at no point, even with an unvalidated pagetable type, can the
>> contents of the page be anything other than legal PTEs.  (I think)
> 
> Correct, because in order to write to the page one has to either
> make it a page table one again (and then write through hypercall
> or for L1 through PTWR) or the mmu-normal-pt-update would first
> convert the page to a writable one.

Besides wanting to ping this change / discussion, I'd also like
to correct myself on this last part of the reply: The above
applies to pages after having got de-validated. However, prior
to validation pages have their type changed early (type ref count
remains zero), but at this point it can't be told yet whether the
page consists of all legal PTEs.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.