
Re: [Xen-devel] [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list

On 14/11/14 14:14, Jürgen Groß wrote:
> On 11/14/2014 02:56 PM, Andrew Cooper wrote:
>> On 14/11/14 12:53, Juergen Gross wrote:
>>> On 11/14/2014 12:41 PM, Andrew Cooper wrote:
>>>> On 14/11/14 09:37, Juergen Gross wrote:
>>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list
>>>>> currently contains the mfn of the top level page frame of the 3 level
>>>>> p2m tree, which is used by the Xen tools during saving and restoring
>>>>> (and live migration) of pv domains and for crash dump analysis. With
>>>>> three levels of the p2m tree it is possible to support up to 512 GB of
>>>>> RAM for a 64 bit pv domain.
>>>>> A 32 bit pv domain can support more, as each memory page can hold 1024
>>>>> instead of 512 entries, leading to a limit of 4 TB.
>>>>> To be able to support more RAM on x86-64 switch to a virtual mapped
>>>>> p2m list.
>>>>> This patch expands struct arch_shared_info with a new p2m list virtual
>>>>> address and the mfn of the page table root. The new information is
>>>>> indicated by the domain to be valid by storing ~0UL into
>>>>> pfn_to_mfn_frame_list_list. The hypervisor indicates usability of this
>>>>> feature by a new flag XENFEAT_virtual_p2m.
>>>> How do you envisage this being used?  Are you expecting the tools to do
>>>> manual pagetable walks using xc_map_foreign_xxx()?
>>> Yes. Not very different compared to today's mapping via the 3 level
>>> p2m tree. Just another entry format, 4 instead of 3 levels and starting
>>> at an offset.
>> Yes - David and I were discussing this over lunch, and it is not
>> actually very different.
>> In reality, how likely is it that the pages backing this virtual linear
>> array change?
> Very unlikely, I think. But not impossible.
>> One issue currently is that, during the live part of migration, the
>> toolstack has no way of working out whether the structure of the p2m has
>> changed (intermediate leaves rearranged, or the length increasing).
>> In the case that the VM does change the structure of the p2m under the
>> feet of the toolstack, migration will either blow up in a non-subtle way
>> with a p2m/m2p mismatch, or in a subtle way with the receiving side
>> copying the new p2m over the wrong part of the new domain.
>> I am wondering whether, with this new p2m method, we can take sufficient
>> steps to be able to guarantee mishaps like this can't occur.
> This should be easy: I could add a counter in arch_shared_info which is
> incremented whenever a p2m mapping is being changed. The toolstack could
> compare the counter values before start and at end of migration and redo
> the migration (or fail) if they are different. In order to avoid races
> I would have to increment the counter before and after changing the
> mapping.
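
For reference, the counter scheme proposed in the quoted text could look roughly like the sketch below. It is only an illustration: the structure, the field name p2m_generation and the helper functions are hypothetical and not part of the patch, and the memory barriers a real guest would need around the update are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of arch_shared_info. */
struct p2m_shared {
    uint64_t p2m_generation; /* assumed name: even = stable, odd = change in progress */
    uint64_t mapping;        /* placeholder for "the p2m mapping" */
};

/* Guest side: increment the counter before and after changing the mapping.
 * A real implementation would need write barriers around the update. */
static void guest_change_mapping(struct p2m_shared *s, uint64_t new_mapping)
{
    s->p2m_generation++;     /* now odd: change in progress */
    s->mapping = new_mapping;
    s->p2m_generation++;     /* even again: change complete */
}

/* Toolstack side: sample the counter before migration starts. */
static uint64_t toolstack_begin(const struct p2m_shared *s)
{
    return s->p2m_generation;
}

/* Toolstack side: true only if the sample was taken in a stable state
 * and no change has started or completed since then. */
static bool toolstack_unchanged(const struct p2m_shared *s, uint64_t start)
{
    return (start & 1) == 0 && s->p2m_generation == start;
}

/* Small self-check: a change between begin and end must be detected. */
static int counter_demo(void)
{
    struct p2m_shared s = { 0, 100 };
    uint64_t start = toolstack_begin(&s);

    assert(toolstack_unchanged(&s, start));
    guest_change_mapping(&s, 200);
    assert(!toolstack_unchanged(&s, start));
    return 0;
}
```

Note that this check only tells the toolstack after the fact that something changed; it does not stop the toolstack from working on outdated data in the meantime.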

That is insufficient, I believe.


* Toolstack walks pagetables and maps the frames containing the linear p2m
* Live migration starts
* VM remaps a frame in the middle of the linear p2m
* Live migration continues, but the toolstack has a stale frame in the
  middle of its view of the p2m.
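
The sequence above can be modelled with a toy example. This is only a sketch under simplifying assumptions: all types and function names are made up for illustration, and the "frames" are ordinary C structures rather than real foreign mappings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define P2M_FRAMES        4 /* toy size: frames backing the linear p2m */
#define ENTRIES_PER_FRAME 4 /* toy size: p2m entries per frame */

/* A toy frame holding p2m entries. */
struct frame {
    uint64_t mfn[ENTRIES_PER_FRAME];
};

/* Guest page tables for the linear p2m: which frame backs each slot. */
struct guest {
    struct frame *backing[P2M_FRAMES];
};

/* Toolstack view: the frames it found when it walked the pagetables once. */
struct toolstack_view {
    const struct frame *mapped[P2M_FRAMES];
};

/* Step 1: toolstack walks the pagetables and maps the backing frames. */
static void toolstack_map_p2m(struct toolstack_view *v, const struct guest *g)
{
    for (size_t i = 0; i < P2M_FRAMES; i++)
        v->mapped[i] = g->backing[i];
}

/* Step 3: the VM remaps one frame of the linear p2m. */
static void guest_remap_frame(struct guest *g, size_t slot, struct frame *repl)
{
    g->backing[slot] = repl;
}

/* Step 4: count the slots where the toolstack still holds the old frame. */
static size_t count_stale_frames(const struct toolstack_view *v,
                                 const struct guest *g)
{
    size_t stale = 0;

    for (size_t i = 0; i < P2M_FRAMES; i++)
        if (v->mapped[i] != g->backing[i])
            stale++;
    return stale;
}
```

In the toy model the toolstack can compare against the guest's pagetables directly; the real toolstack would have to walk the pagetables again to notice the difference.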

As the p2m is almost never expected to change, I think it might be
better to have a flag the toolstack can set to say "The toolstack is
peeking at your p2m behind your back - you must not change its structure."
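
A rough sketch of what such a flag could look like follows. The structure, the field name toolstack_peeking and the helpers are hypothetical, and nothing here is an existing Xen interface. It also ignores the window between the guest checking the flag and making its change, which a real design would have to close.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared state between toolstack and guest. */
struct p2m_peek_state {
    bool     toolstack_peeking; /* assumed flag, set by the toolstack */
    uint64_t p2m_root;          /* placeholder for the p2m structure */
};

/* Toolstack side: announce or withdraw interest in the p2m structure. */
static void toolstack_set_peeking(struct p2m_peek_state *s, bool on)
{
    s->toolstack_peeking = on;
}

/* Guest side: refuse structural changes while the flag is set.
 * Returns 0 on success, -EBUSY if the toolstack is looking. */
static int guest_try_restructure(struct p2m_peek_state *s, uint64_t new_root)
{
    if (s->toolstack_peeking)
        return -EBUSY;
    s->p2m_root = new_root;
    return 0;
}

/* Small self-check of the intended behaviour. */
static int flag_demo(void)
{
    struct p2m_peek_state s = { false, 1 };

    toolstack_set_peeking(&s, true);
    assert(guest_try_restructure(&s, 2) == -EBUSY);
    toolstack_set_peeking(&s, false);
    assert(guest_try_restructure(&s, 2) == 0);
    return 0;
}
```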

Having just thought this through, I think there is also a race condition
between a VM changing an entry in the p2m and the toolstack verifying
the frames being sent.

