
Re: [Xen-devel] [PATCH] libxc: Expose the pdpe1gb cpuid flag to guest



Andrew Cooper wrote on 2014-11-18:
> On 18/11/14 10:14, Tim Deegan wrote:
>> At 17:25 +0000 on 17 Nov (1416241517), Andrew Cooper wrote:
>>> On 17/11/14 17:00, Tim Deegan wrote:
>>>> At 16:40 +0000 on 17 Nov (1416238835), Andrew Cooper wrote:
>>>>> On 17/11/14 16:30, Tim Deegan wrote:
>>>>>> At 16:24 +0000 on 17 Nov (1416237888), Jan Beulich wrote:
>>>>>>>>>> On 17.11.14 at 16:39, <Ian.Jackson@xxxxxxxxxxxxx> wrote:
>>>>>>>> Liang Li writes ("[PATCH] libxc: Expose the pdpe1gb cpuid flag
>>>>>>>> to guest"):
>>>>>>>>> If the hardware supports the pdpe1gb flag, expose it to the
>>>>>>>>> guest by default.  Users don't have to use a 'cpuid=' option
>>>>>>>>> in the config file to turn it on.
>>>>>>>> I don't understand what this flag does.  I guess from the name
>>>>>>>> it turns on 1G pages.  I guess these are supported ?
>>>>>>>> 
>>>>>>>> I would like to see comment from an x86 hypervisor maintainer.
>>>>>>> Yes, we support 1Gb pages. The purpose of the patch is to not
>>>>>>> unconditionally hide the flag from guests. (I had commented on
>>>>>>> v1, but sadly this one wasn't tagged as v2, nor was I included
>>>>>>> on the Cc list, hence I didn't spot the new version.)
>>>>>>> 
>>>>>>> The one thing I'm not certain about is shadow mode: Only 2Mb
>>>>>>> pages may be supported there. Tim?
>>>>>> Indeed, only 2MiB pages are supported in shadow mode.  See, e.g.
>>>>>> 
>>>>>> guest_supports_1G_superpages()->hvm_pse1gb_supported()->paging_mode_hap()
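
For reference, the check chain Tim mentions boils down to roughly the
following (paraphrased here for illustration, not quoted verbatim; the
exact definitions in the tree may differ):

/* Approximate sketch of the existing checks, for illustration only. */

/* xen/include/asm-x86/hvm/hvm.h (approximate) */
#define hvm_pse1gb_supported(d) \
    (cpu_has_page1gb && paging_mode_hap(d))

/* guest pagetable walker (approximate) */
static inline int guest_supports_1G_superpages(struct vcpu *v)
{
    return (GUEST_PAGING_LEVELS >= 4 &&
            hvm_pse1gb_supported(v->domain));
}

So under shadow paging the software walker already refuses 1GB
mappings, which matches what Tim says above.
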
>>>>> This is yet another case which proves that libxc is the wrong
>>>>> place to be choosing the cpuid flags exposed to a domain.
>>>>> 
>>>>> Furthermore, guest_supports_1G_superpages() is insufficient as it
>>>>> only checks whether the host is capable of providing 1G
>>>>> superpages, not whether the guest has been permitted to use it.
>>>>> 
>>>>> This causes a problem when migrating between hap-capable and
>>>>> hap-incapable systems.
>>>> There's no notion of being 'permitted' to use 1G pages, AFAICS,
>>>> much like other CPU features.  But of course a _well-behaved_
>>>> guest that has been told via cpuid not to use 1G superpages will
>>>> have no problems. :)
>>> That is my point.
>>> 
>>> If 1GB pages are not supported (i.e. not present in cpuid), then
>>> the PS bit in an L3 entry is reserved, must be 0, and causes a
>>> pagefault if set.
>>> 
>>> Nothing in Xen currently enforces this because
>>> guest_supports_1G_superpages() doesn't check domain_cpuid().
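
If we wanted a guest-policy-aware check, it would also have to consult
the domain's CPUID view, e.g. something along these lines (purely a
hypothetical sketch, not a proposed patch; guest_permitted_1G_superpages()
is a made-up name):

/* Hypothetical sketch only, not from the Xen tree. */
static inline int guest_permitted_1G_superpages(struct vcpu *v)
{
    unsigned int eax, ebx, ecx, edx;

    /* Host/paging-mode capability, as today. */
    if ( !hvm_pse1gb_supported(v->domain) )
        return 0;

    /* Guest policy: CPUID.80000001H:EDX bit 26 is PDPE1GB. */
    domain_cpuid(v->domain, 0x80000001, 0, &eax, &ebx, &ecx, &edx);
    return !!(edx & (1u << 26));
}
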
>> For shadow mode, Xen DTRT by checking hvm_pse1gb_supported() in the
>> HVM cpuid handler, so we don't advertise a feature we can't support.
>> 
>> For HAP mode, we _could_ add a cpuid check to the pagetable walker
>> but...
>> 
>>> It does however make me wonder how VMX/SVM non-root mode would
>>> enforce this as it would see the host cpuid, not guest cpuid when
>>> performing paging internally.
>> ...non-emulated PT walks will get to use 1GB superpages anyway.
>> This is the same for other features (new instructions &c).  We can
>> mask them out of CPUID but by and large we can't make them fault.
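
For context, the "mask them out of CPUID" part is, as far as I can see,
roughly the following step in the HVM CPUID handler (the helper name
below is made up; the actual logic lives inline in hvm_cpuid() in
xen/arch/x86/hvm/hvm.c and may differ in detail):

/* Hypothetical helper expressing the masking step, for illustration. */
static void hide_pdpe1gb_if_unsupported(struct domain *d,
                                        unsigned int leaf,
                                        unsigned int *edx)
{
    /* CPUID.80000001H:EDX bit 26 (PDPE1GB) advertises 1GB superpages.
     * Clearing it only changes what the guest sees via CPUID; it cannot
     * make a hardware-performed page walk fault on a 1GB mapping. */
    if ( leaf == 0x80000001 && !hvm_pse1gb_supported(d) )
        *edx &= ~(1u << 26);
}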

Agreed.  I will forward this question internally to see whether they are
aware of this problem.

> 
> Hmm - this is a pitfall waiting to happen.
> 
> In the case that there is a heterogeneous setup with one 1G capable
> and one 1G incapable server, Xen cannot forcibly prevent the use of 1G
> pages on the capable hardware.  Any VM which guesses at hardware
> support by means other than cpuid features is liable to explode on migrate.

But a normal guest shouldn't guess at it, right?
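
A well-behaved guest checks CPUID before touching the feature, e.g.
(guest-side illustration only, not taken from any particular OS):

/* Illustrative guest-side check, GCC-style. */
#include <cpuid.h>
#include <stdbool.h>

static bool guest_has_1gb_pages(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* CPUID.80000001H:EDX bit 26 (PDPE1GB). */
    if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx))
        return false;
    return edx & (1u << 26);
}

The concern is only about a guest that skips this and guesses by some
other means, which is the case Andrew describes above.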

> 
> I suspect this will just have to fall into the category of "here be
> yet more dragons with heterogeneous migration"
> 
> ~Andrew


Best regards,
Yang



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

