Re: [Xen-devel] [PATCH 1/4] expand x86 arch_shared_info to support linear p2m list

On 11/14/2014 02:56 PM, Andrew Cooper wrote:
On 14/11/14 12:53, Juergen Gross wrote:
On 11/14/2014 12:41 PM, Andrew Cooper wrote:
On 14/11/14 09:37, Juergen Gross wrote:
The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list
currently contains the mfn of the top-level page frame of the 3-level
p2m tree, which the Xen tools use when saving and restoring (and live
migrating) pv domains and for crash dump analysis. With three levels
of the p2m tree it is possible to support up to 512 GB of RAM for a
64-bit pv domain.

A 32-bit pv domain can support more, as each memory page can hold 1024
instead of 512 entries, leading to a limit of 4 TB.
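
The two limits above follow from the tree geometry. A minimal sketch of the arithmetic, assuming 4 KiB pages and entry sizes of 8 bytes (64-bit) and 4 bytes (32-bit); the function name is made up for illustration and is not part of any Xen interface:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL  /* x86 page size in bytes (assumed 4 KiB) */

/*
 * Bytes of guest RAM describable by a p2m tree with `levels` levels,
 * where every tree page holds PAGE_SIZE / entry_size entries and each
 * leaf entry describes one guest page.
 */
uint64_t p2m_tree_max_bytes(unsigned int levels, unsigned int entry_size)
{
    assert(levels > 0 && entry_size > 0);

    uint64_t entries_per_page = PAGE_SIZE / entry_size;
    uint64_t pages = 1;

    /* Each level multiplies the number of reachable leaf entries. */
    for (unsigned int i = 0; i < levels; i++)
        pages *= entries_per_page;

    return pages * PAGE_SIZE;
}
```

With three levels, 8-byte entries give 512^3 pages of 4 KiB, i.e. 512 GB, and 4-byte entries give 1024^3 pages, i.e. 4 TB.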

To support more RAM on x86-64, switch to a virtually mapped p2m
list.

This patch expands struct arch_shared_info with a new p2m list virtual
address and the mfn of the page table root. The domain indicates that
the new information is valid by storing ~0UL into
pfn_to_mfn_frame_list_list. The hypervisor advertises this feature
through a new flag, XENFEAT_virtual_p2m.
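
As a rough illustration of the description above, a consumer could test the marker like this. This is only a sketch: the struct is a simplified stand-in, and the two new field names (p2m_pt_root_mfn, p2m_vaddr) and the helper name are assumptions, not taken from the patch. Only pfn_to_mfn_frame_list_list and the ~0UL marker come from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for the x86 struct arch_shared_info. */
struct arch_shared_info_sketch {
    /* mfn of the 3-level tree root, or ~0UL if the new fields are valid */
    unsigned long pfn_to_mfn_frame_list_list;
    unsigned long p2m_pt_root_mfn;  /* assumed name: mfn of page table root */
    unsigned long p2m_vaddr;        /* assumed name: virtual address of p2m list */
};

/* Marker value the domain stores to declare the new information valid. */
#define P2M_LINEAR_LIST_MARKER (~0UL)

/* Returns true if the domain has switched to the linear p2m list. */
bool p2m_uses_linear_list(const struct arch_shared_info_sketch *arch)
{
    assert(arch != NULL);
    return arch->pfn_to_mfn_frame_list_list == P2M_LINEAR_LIST_MARKER;
}
```

A real consumer would presumably also check that the hypervisor advertises XENFEAT_virtual_p2m before relying on the new fields; that check is left out here.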

How do you envisage this being used?  Are you expecting the tools to do
manual pagetable walks using xc_map_foreign_xxx() ?

Yes. Not very different compared to today's mapping via the 3 level
p2m tree. Just another entry format, 4 instead of 3 levels and starting
at an offset.
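
To make the "4 levels, starting at an offset" part concrete, here is a minimal sketch of the address arithmetic such a walk would need, assuming standard x86-64 4-level paging (9 index bits per level, 4 KiB pages) and 8-byte p2m list entries. The helper names are hypothetical, and mapping each table frame from the toolstack is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define PT_LEVELS      4   /* x86-64 4-level paging */
#define PT_INDEX_BITS  9   /* 512 entries per page table */
#define PAGE_SHIFT     12  /* 4 KiB pages */

/* Index into the page table at `level` (4 = top level, 1 = leaf level). */
unsigned int pt_index(uint64_t vaddr, unsigned int level)
{
    assert(level >= 1 && level <= PT_LEVELS);

    unsigned int shift = PAGE_SHIFT + PT_INDEX_BITS * (level - 1);
    return (unsigned int)((vaddr >> shift) & ((1u << PT_INDEX_BITS) - 1));
}

/*
 * Virtual address of the p2m list entry for `pfn`, for a linear list of
 * 8-byte entries that starts at `p2m_vaddr` (entry size is an assumption).
 */
uint64_t p2m_entry_vaddr(uint64_t p2m_vaddr, uint64_t pfn)
{
    return p2m_vaddr + pfn * sizeof(uint64_t);
}
```

A walk would compute p2m_entry_vaddr() for the pfn of interest, then follow pt_index() at levels 4 down to 1 from the page table root, mapping one table frame per level.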

Yes - David and I were discussing this over lunch, and it is not
actually very different.

In reality, how likely is it that the pages backing this virtual linear
array change?

Very unlikely, I think. But not impossible.

One issue currently is that, during the live part of migration, the
toolstack has no way of working out whether the structure of the p2m has
changed (intermediate leaves rearranged, or the length increasing).

In the case that the VM does change the structure of the p2m under the
feet of the toolstack, migration will either blow up in a non-subtle way
with a p2m/m2p mismatch, or in a subtle way with the receiving side
copying the new p2m over the wrong part of the new domain.

I am wondering whether, with this new p2m method, we can take sufficient
steps to be able to guarantee mishaps like this can't occur.

This should be easy: I could add a counter in arch_shared_info which is
incremented whenever a p2m mapping is being changed. The toolstack could
compare the counter values before start and at end of migration and redo
the migration (or fail) if they are different. In order to avoid races
I would have to increment the counter before and after changing the


