
Re: [Xen-devel] [Patch V2] expand x86 arch_shared_info to support linear p2m list



>>> On 01.12.14 at 15:33, <JGross@xxxxxxxx> wrote:
> On 12/01/2014 02:37 PM, Jan Beulich wrote:
>>>>> On 01.12.14 at 14:11, <JGross@xxxxxxxx> wrote:
>>> On 12/01/2014 12:29 PM, Jan Beulich wrote:
>>>>>>> On 01.12.14 at 12:19, <david.vrabel@xxxxxxxxxx> wrote:
>>>>> On 01/12/14 10:15, Jan Beulich wrote:
>>>>>>>>> On 01.12.14 at 10:29, <JGross@xxxxxxxx> wrote:
>>>>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list
>>>>>>> currently contains the mfn of the top level page frame of the 3 level
>>>>>>> p2m tree, which is used by the Xen tools during saving and restoring
>>>>>>> (and live migration) of pv domains and for crash dump analysis. With
>>>>>>> three levels of the p2m tree it is possible to support up to 512 GB of
>>>>>>> RAM for a 64 bit pv domain.
>>>>>>>
>>>>>>> A 32 bit pv domain can support more, as each memory page can hold 1024
>>>>>>> instead of 512 entries, leading to a limit of 4 TB.
>>>>>>>
>>>>>>> To be able to support more RAM on x86-64 switch to a virtual mapped
>>>>>>> p2m list.
>>>>>>>
>>>>>>> This patch expands struct arch_shared_info with a new p2m list virtual
>>>>>>> address, the page table root and a p2m generation count.
>>>>>>> The new information is indicated by the domain to be valid by storing
>>>>>>> ~0UL into pfn_to_mfn_frame_list_list. The hypervisor indicates
>>>>>>> usability of this feature by a new flag XENFEAT_virtual_p2m.
>>>>>>>
>>>>>>> Right now XENFEAT_virtual_p2m will not be set. This will change when
>>>>>>> the Xen tools support the virtual mapped p2m list.
>>>>>>
>>>>>> This seems wrong: XENFEAT_* only reflect hypervisor capabilities.
>>>>>> I.e. the availability of the new functionality may need to be
>>>>>> advertised another way - xenstore perhaps?
>>>>>
>>>>> Xenstore doesn't work for dom0.
>>>>>
>>>>> Shouldn't this be something the guest kernel reports using an ELF note bit?
>>>>>
>>>>> When building a domain (either in Xen for dom0 or in the tools), the
>>>>> builder may provide a linear p2m iff supported by the guest kernel and
>>>>> then (and only then) can it provide a guest with > 512 GiB.
>>>>
>>>> Yes, surely this flag could act as a kernel capability indicator (via
>>>> the XEN_ELFNOTE_SUPPORTED_FEATURES note), like e.g.
>>>> XENFEAT_dom0 already does. Jürgen's final statement, however,
>>>> suggested to me that this is meant to be only consumed by kernels.
>>>
>>> Yes. The p2m list built by the domain builder is already linear. It may
>>> just be too small to hold all entries required e.g. for Dom0.
>>>
>>> It's Xen-tools and kdump which have to deal with the linear p2m list.
>>> So the guest kernel has to be told if it is allowed to present the
>>> linear list instead of the 3-level tree at pfn_to_mfn_frame_list_list.
>>>
>>> As this is true for Dom0 as well, this information must be given by the
>>> hypervisor.
>>>
>>> I'm aware that XENFEAT_* is only used for hypervisor capabilities up to
>>> now. As the Xen tools are tightly coupled to the hypervisor I don't see
>>> why the features can't express the capability of the complete Xen
>>> installation instead. Would you prefer introducing another leaf for
>>> that purpose (submap.idx == 1) ?
>>
>> That wouldn't change the odd situation of reporting a capability of
>> another component. That's even more of a problem for the Dom0
>> case, where the affected tool (kdump) isn't even under our control
>> (and shouldn't be).
>>
>> But in the end - what's wrong with always (or conditionally upon a
>> CONFIG_* option and/or command line parameter and/or memory
>> size) filling both the old and new shared info fields? A capable tool
>> can determine whether the new one is valid, and an incapable tool
>> won't work on huge memory configs anyway.
> 
> Okay, but this would require another way of reporting the validity of
> the linear p2m list anchor, as setting pfn_to_mfn_frame_list_list to
> an invalid value is no longer an option then.
> 
> As the shared info page is always zeroed when the domain is built we
> could use a value different from 0 of e.g. the p2m_generation member
> as an indicator for the validity.

Wouldn't both of the other new fields be guaranteed non-zero
when used?
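
To illustrate what I mean, a tool could test the new fields directly
instead of relying on a separate indicator. This is only a rough sketch:
the struct is a stand-in for the relevant part of arch_shared_info, and
the names p2m_vaddr and p2m_root are placeholders I made up (only
pfn_to_mfn_frame_list_list and p2m_generation are named in this thread).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of arch_shared_info.
 * p2m_vaddr and p2m_root are placeholder names, not the patch's. */
struct p2m_info_sketch {
    uint64_t pfn_to_mfn_frame_list_list; /* MFN of the 3-level tree top */
    uint64_t p2m_vaddr;                  /* virtual address of linear p2m list */
    uint64_t p2m_root;                   /* page table root to walk p2m_vaddr */
    uint64_t p2m_generation;             /* p2m generation count */
};

/* The shared info page is zeroed when the domain is built, so both new
 * fields being non-zero would mean the guest has filled them in. */
static bool linear_p2m_valid(const struct p2m_info_sketch *info)
{
    return info->p2m_vaddr != 0 && info->p2m_root != 0;
}
```

With a check like this the guest could keep filling
pfn_to_mfn_frame_list_list for incapable tools, while a capable tool
would prefer the linear list whenever linear_p2m_valid() returns true.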

Jan
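
P.S. For reference, the limits quoted in the original patch description
(512 GB for 64 bit, 4 TB for 32 bit pv domains) follow from the number
of entries per page. A small sketch of the arithmetic, assuming 4 KiB
pages and p2m entries of 8 bytes (64 bit) or 4 bytes (32 bit); the
function and macro names are made up for this illustration:

```c
#include <stdint.h>

#define P2M_SKETCH_PAGE_SIZE 4096ULL /* assumed x86 page size in bytes */

/* Maximum guest memory (in bytes) that a p2m tree with the given number
 * of levels can describe, for a given size of one p2m entry. */
static uint64_t p2m_tree_max_bytes(unsigned entry_size, unsigned levels)
{
    uint64_t entries_per_page = P2M_SKETCH_PAGE_SIZE / entry_size;
    uint64_t pages = 1;

    /* Each level multiplies the number of reachable pages. */
    for (unsigned i = 0; i < levels; i++)
        pages *= entries_per_page;

    return pages * P2M_SKETCH_PAGE_SIZE;
}
```

Three levels with 512 entries per page give 2^27 pages, i.e. 512 GiB;
with 1024 entries per page the same three levels reach 2^30 pages, i.e.
4 TiB.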

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

