Re: [Xen-devel] [PATCH 3/3] xen: eliminate scalability issues from initial mapping setup

Direct Xen to place the initial P->M table outside of the initial
mapping, as otherwise the 1G (implementation) / 2G (theoretical)
restriction on the size of the initial mapping limits the amount
of memory a domain can be handed initially.
The three level p2m limits memory to 512 GiB on x86-64 but this patch
doesn't seem to address this limit and thus seems a bit useless to
Any increase of the p2m beyond 3 levels will need to come with
substantial libxc changes first.  3 level p2ms are hard coded in
all the PV build and migrate code.
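
For reference, the 512 GiB figure follows from the tree geometry: with
4 KiB pages and 8-byte entries, each p2m page holds 512 entries, so three
levels cover 512^3 frames.  A minimal sketch of that arithmetic (the macro
and function names are made up for illustration; 4 KiB pages and 64-bit
entries are assumed):

```c
#include <stdint.h>

/* Assumptions: 4 KiB pages, 8-byte (64-bit) p2m entries. */
#define PAGE_BYTES        4096ULL
#define ENTRIES_PER_PAGE  (PAGE_BYTES / sizeof(uint64_t))   /* 512 */

/* Bytes of guest memory coverable by a p2m tree of the given depth. */
static uint64_t p2m_max_bytes(unsigned int levels)
{
    uint64_t frames = 1;

    for (unsigned int i = 0; i < levels; i++)
        frames *= ENTRIES_PER_PAGE;

    return frames * PAGE_BYTES;
}
```

Under these assumptions p2m_max_bytes(3) is 2^39 bytes (512 GiB), and a
fourth level would raise the ceiling to 2^48 bytes (256 TiB).
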
No, there is no such dependency - the kernel could use 4 levels at
any time (sacrificing being able to get migrated), making sure it
only exposes the 3 levels hanging off the fourth level (or not
exposing this information at all) to external entities making this
wrong assumption.


That would require that the PV kernel must start with a 3 level p2m and
fudge things afterwards.

I always thought the 3 level p2m is constructed by the kernel, not by
the tools.

It starts with the linear p2m list anchored at xen_start_info->mfn_list,
constructs the p2m tree and writes the p2m_top_mfn mfn to

See comment in the kernel source arch/x86/xen/p2m.c

So booting with a larger p2m list can be handled completely by the
kernel itself.
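
To make the tree construction a bit more concrete, here is a rough sketch
of how a pfn could be split into per-level indices for a 3 level tree with
512 entries per page.  The names are illustrative and only loosely
modelled on the idea described in arch/x86/xen/p2m.c; this is not a copy
of the kernel code.

```c
#include <stdint.h>

#define P2M_ENTRIES 512ULL   /* assumed: 4 KiB page / 8-byte entry */

struct p2m_index {
    uint64_t top;    /* slot in the top-level page */
    uint64_t mid;    /* slot in the mid-level page */
    uint64_t leaf;   /* slot in the leaf page holding the mfn */
};

/* Decompose a pfn into the three indices used to walk the tree. */
static struct p2m_index p2m_split(uint64_t pfn)
{
    struct p2m_index idx;

    idx.top  = pfn / (P2M_ENTRIES * P2M_ENTRIES);
    idx.mid  = (pfn / P2M_ENTRIES) % P2M_ENTRIES;
    idx.leaf = pfn % P2M_ENTRIES;
    return idx;
}

/* A pfn fits a 3 level tree only if its top index stays within one page. */
static int p2m_fits_three_levels(uint64_t pfn)
{
    return p2m_split(pfn).top < P2M_ENTRIES;
}
```

The first pfn that no longer fits is 2^27, which is where the 512 GiB
limit comes from.
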

Ah yes - I remember now.  All the toolstack does is create the linear
p2m.  In which case building such a domain will be fine.

At a minimum, I would expect a patch to libxc to detect a 4 level PV
guest and fail with a meaningful error, rather than an obscure "m2p
doesn't match p2m for mfn/pfn X".

I'd rather fix it in a clean way.

I think the best way to do it would be an indicator in the p2m array
anchor, e.g. setting 1<<61 in pfn_to_mfn_frame_list_list. This will
result in an early error with old tools:
"Couldn't map p2m_frame_list_list"

No it won't.  The is_mapped() macro in the toolstack is quite broken.  It
stems from a lack of Design/API/ABI concerning things like the p2m.  In
particular, INVALID_MFN is not an ABI constant, nor is any notion of
mapped vs unmapped.

That's not relevant here. map_frame_list_list() in xc_domain_save.c
reads pfn_to_mfn_frame_list_list and tries to map that mfn directly.
This will fail and result in the above error message.
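
A small sketch of why a value carrying the proposed marker bit cannot be
mistaken for a real frame: with 4 KiB pages and at most 52 physical
address bits on x86-64, valid mfns stay below 2^40, so presumably any
attempt to map a value with bit 61 set is rejected.  The macro names and
the range check are hypothetical, not libxc code.

```c
#include <stdint.h>

#define P2M_FLL_MARKER  (1ULL << 61)        /* hypothetical flag bit from the proposal */
#define X86_64_MAX_MFN  ((1ULL << 40) - 1)  /* 52 address bits - 12 page-offset bits */

/* Returns 1 if the value could be a real frame number, 0 otherwise. */
static int fll_could_be_mfn(uint64_t fll)
{
    return fll <= X86_64_MAX_MFN;
}
```
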

Its current implementation is a relic of 32-bit days, and only checks bit
31.  It also means that it is impossible to migrate a PV VM with pfns
above the 43-bit limit; a restriction which is lifted by my migration v2
series.  A lot of the other migration constructs are in a similar state,
which is why they are being deleted by the v2 series.
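
To illustrate the bit 31 problem, here is a stand-in for a check of that
shape (an illustration only, not the actual libxc macro).  With 4 KiB
pages, frame number bit 31 corresponds to byte address 2^43, which is
where the 43-bit limit comes from.

```c
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumed 4 KiB pages */

/* Illustrative stand-in for a "bit 31 set means unmapped" test. */
static int bit31_says_mapped(uint64_t frame)
{
    return !(frame & 0x80000000ULL);
}

/* Byte address at which a given frame starts. */
static uint64_t frame_to_addr(uint64_t frame)
{
    return frame << PAGE_SHIFT;
}
```

Any perfectly valid frame at or above 2^31 (8 TiB) would be reported as
unmapped by such a check.
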

The clean way to fix this is to leave pfn_to_mfn_frame_list_list as
INVALID_MFN. Introduce two new fields beside it named p2m_levels and
p2m_root, which then caters for levels greater than 4 in a compatible

I don't mind doing it this way.
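
Just to have something concrete to discuss, a sketch of how a toolstack
might tell the two layouts apart under the two-field proposal.  The
struct layout, the enum and the INVALID_MFN value are all assumptions
made for this example; only the names p2m_levels, p2m_root and
pfn_to_mfn_frame_list_list come from the proposal itself.

```c
#include <stdint.h>

#define INVALID_MFN (~0ULL)  /* assumed sentinel; not an ABI constant today */

struct p2m_anchor {          /* hypothetical layout, for illustration only */
    uint64_t pfn_to_mfn_frame_list_list;
    uint64_t p2m_root;
    uint64_t p2m_levels;
};

enum p2m_layout { P2M_LEGACY_3LEVEL, P2M_EXTENDED, P2M_UNKNOWN };

/* Decide which description of the p2m the guest has published. */
static enum p2m_layout p2m_classify(const struct p2m_anchor *a)
{
    if (a->pfn_to_mfn_frame_list_list != INVALID_MFN)
        return P2M_LEGACY_3LEVEL;   /* old-style 3 level tree */
    if (a->p2m_levels != 0 && a->p2m_root != INVALID_MFN)
        return P2M_EXTENDED;        /* new fields are in use */
    return P2M_UNKNOWN;             /* nothing usable: fail with a clear error */
}
```
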


