
Re: [Xen-devel] Xen PV PTE ABI (or lack thereof)

On 21/01/16 12:59, Jan Beulich wrote:
>>>> For the PTE bits, _PAGE_GNTTAB (bit 62) is used exclusively in debug
>>>> builds (so there is a guest-observable difference between running on a
>>>> debug and a non-debug Xen), and the comment beside it even identifies
>>>> that it breaks BSD guests.  PTE bits 62:59 are used by hardware if
>>>> CR4.PKE is set.  Currently this means that we are not able to support Protection
>>>> Key for PV guests (although this restriction technically only applies to
>>>> debug builds of the hypervisor).
>>>> The other PTE bit used by Xen is _PAGE_GUEST_KERNEL (bit 52).  This bit
>>>> is used to notice when a 64bit PV guest attempts to override the fixup
>>>> Xen applies to its PTEs.  Xen unilaterally sets _PAGE_GLOBAL for user
>>>> pages, and clears _PAGE_GLOBAL for supervisor mappings, setting
>>>> _PAGE_USER in both cases as the PV kernel runs in ring3.  The only thing
>>>> _PAGE_GUEST_KERNEL is used for is to notice when the kernel deliberately
>>>> tries to create a _PAGE_GUEST_KERNEL|_PAGE_GLOBAL mapping, at which point
>>>> a warning is logged and the kernel's choice is overridden.
>>>> Neither of the used PTE bits exists in the Xen public ABI.  Neither of
>>>> them serves a purpose other than as a debugging aid.
>>>> I propose hiding them behind CONFIG_PV_PTE_DEBUG and declaring an ABI of
>>>> "all bits available for guest use".
>>> And a kernel using any of the conflicting bits would then become
>>> unusable on a hypervisor with that debug option enabled? I'd
>>> rather see us document the state things are in...
>> The comment beside _PAGE_GNTTAB already states:
>> /*
>>  * Debug option: Ensure that granted mappings are not implicitly unmapped.
>>  * WARNING: This will need to be disabled to run OSes that use the spare PTE
>>  * bits themselves (e.g., *BSD).
>>  */
> But that's (assuming the use of the two bits were spelled out) a
> guest OS not fully playing by the spec. To me, "available" PTE bits
> being shared implies that some of them may be claimed by Xen,
> while others may be claimed by guests. You're right that this needs
> to be written down, but I don't think we need to go as far as
> forbidding Xen to use any of them. And even less so should we
> preclude their use for any purpose going forward.

I do not want to make an ABI which mandates the use of certain bits by Xen.

As we see with the Protection Key feature, newer hardware features start
using bits which were previously software-available, and we absolutely
don't want to be in a position where our ABI prevents us from ever
supporting a new feature.
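
To make the overlap concrete, here is a minimal sketch using only the bit
positions quoted above (bit 62 for _PAGE_GNTTAB, bit 52 for
_PAGE_GUEST_KERNEL, bits 62:59 for protection keys).  The macro and function
names are made up for illustration and are not the ones in the Xen tree.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names only; bit positions are the ones quoted in this thread. */
#define PTE_BIT(n)        (UINT64_C(1) << (n))
#define PTE_GNTTAB        PTE_BIT(62)            /* debug-build Xen usage */
#define PTE_GUEST_KERNEL  PTE_BIT(52)            /* Xen fixup tracking */
#define PTE_PKEY_MASK     (UINT64_C(0xf) << 59)  /* bits 62:59, if CR4.PKE */

/* Returns 1 if any of the given software bits falls inside the pkey field. */
static inline int overlaps_pkey(uint64_t sw_bits)
{
    return (sw_bits & PTE_PKEY_MASK) != 0;
}

/* Sanity check of the claim made in the text. */
static inline void check_overlap(void)
{
    assert(overlaps_pkey(PTE_GNTTAB));        /* bit 62 collides */
    assert(!overlaps_pkey(PTE_GUEST_KERNEL)); /* bit 52 does not */
}
```

Only the bit 62 user collides with protection keys today, but nothing stops
a future hardware feature from claiming bit 52 in the same way.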

> In the end, with (as it seems) not much effort this could even be
> made dynamic: A guest could advertise which of the bits it doesn't
> use, and then Xen could pick two of them for the two purposes it
> currently needs them for. Should a guest leave no or just one bit
> available, the debugging aid could then be disabled.

This is unnecessarily complicated.  It only helps going forward, and
leaves Xen with substantially more complicated PTE handling.

What happens if at some point later, we try to boot a PV guest which has
a different PTE layout to what Xen has chosen?  We definitely can't
update every PTE with a newly-chosen layout.
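
As a rough sketch of why, suppose Xen picked its two debug bits from a
guest-advertised "unused" mask.  Everything here is hypothetical (no such
interface exists, and the names are invented); it assumes bits 62:52 are the
candidate software-available bits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed candidate range: PTE bits 62:52 (11 bits). */
#define PTE_SW_AVAIL_MASK (UINT64_C(0x7ff) << 52)

/*
 * Hypothetical negotiation: take the two lowest bits the guest says it does
 * not use.  Returns 1 on success, 0 if fewer than two bits are available
 * (in which case the debugging aid would have to be disabled).
 */
static inline int pick_debug_bits(uint64_t guest_free_mask,
                                  uint64_t *gnttab, uint64_t *guest_kernel)
{
    uint64_t avail = guest_free_mask & PTE_SW_AVAIL_MASK;
    uint64_t first, second;

    if (!avail)
        return 0;
    first = avail & (~avail + 1);   /* isolate lowest set bit */
    avail &= ~first;
    if (!avail)
        return 0;
    second = avail & (~avail + 1);  /* next lowest set bit */

    *gnttab = first;
    *guest_kernel = second;
    return 1;
}
```

A guest advertising bits 52 and 62 and a guest advertising bits 53 and 54
end up with different layouts, so the layout becomes per-domain state which
every PTE validation path has to consult.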

>> I was intending to have CONFIG_PV_PTE_DEBUG as an EXPERT option,
>> disabled by default even in debug builds.
>> There should not be an ABI difference between release and "normal" debug
>> builds.
> Well, I see your point, but as said above I'm not convinced
> disabling all that code is the right solution. In fact, what you
> propose is not far away from removing that code altogether.

The two bits are only used for specialised debugging.  They should be
relegated to people doing specific debugging, and not interfere with the
overwhelming majority of cases where Xen doesn't need to use any
software available PTE bits.
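
Roughly what I have in mind, as a sketch only.  CONFIG_PV_PTE_DEBUG is the
proposed option and does not exist yet; the PV_PTE_* names and the helper
are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Proposed (not yet existing) option; would come from Kconfig and default
 * to off, even in debug builds.
 */
/* #define CONFIG_PV_PTE_DEBUG 1 */

#ifdef CONFIG_PV_PTE_DEBUG
# define PV_PTE_GNTTAB       (UINT64_C(1) << 62)
# define PV_PTE_GUEST_KERNEL (UINT64_C(1) << 52)
#else
/* Without the option, Xen claims no software-available bits at all. */
# define PV_PTE_GNTTAB       UINT64_C(0)
# define PV_PTE_GUEST_KERNEL UINT64_C(0)
#endif

#define PV_PTE_XEN_CLAIMED (PV_PTE_GNTTAB | PV_PTE_GUEST_KERNEL)

/* Returns the bits of a guest-written PTE that Xen also wants to use. */
static inline uint64_t pv_pte_conflicts(uint64_t guest_pte)
{
    return guest_pte & PV_PTE_XEN_CLAIMED;
}

/* With the option off, no guest PTE value can conflict. */
static inline void pv_pte_selfcheck(void)
{
#ifndef CONFIG_PV_PTE_DEBUG
    assert(pv_pte_conflicts(UINT64_MAX) == 0);
#endif
}
```

With the option off, release and ordinary debug builds present the same
behaviour to guests, and the masks compile away to nothing.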

