
Re: [Xen-users] a few questions about superpage support



On Mon, 2013-09-09 at 15:04 -0700, Antonin Bas wrote:
> Hi,
> 
> First of all, thank you very much for your help. Please see inline comments.
> 
> 2013/9/9 Ian Campbell <Ian.Campbell@xxxxxxxxxx>:
> > On Fri, 2013-09-06 at 16:52 -0700, Antonin Bas wrote:
> >> Hi,
> >>
> >> I am working on a project that relies on superpages within a guest. Of
> >> course these superpages need to be backed by actual machine pages.
> >
> > Which type of guest are you running? Most of my reply is specific to HVM,
> > which is what was implied by your interest in HAP.
> 
> I am indeed running HVM guests.
> 
> >
> > There is no inherent need for things which are mapped as superpages in
> > the guest pagetables be also mapped as superpages in the p2m (e.g. HAP)
> > mappings. It is fine for a 2MB guest mapping to be translated via a
> > block of 4K mappings in the p2m (and vice versa).
> >
> > Unless perhaps you mean that your usecase adds an additional
> > requirement?
> 
> Thanks. I have a much better idea of what's actually going on now. In
> my use case, I run a process which makes extensive use of a 1GB memory
> region (with memory accesses randomly distributed over that region). I
> was hoping that by using a 1GB hugepage in the guest for that process,
> and having this 1GB page mapped to an actual 1GB physical block, I
> would avoid TLB misses (it is my understanding that the TLB reserves
> some entries for 1GB mappings, both for gva -> gpa and gpa -> ma
> translations).
> But from what you are saying, it seems that there is no way to
> guarantee that the guest 1GB hugepage will be translated via a 1GB
> mapping in the p2m.

I'm not all that familiar with the internals but I think not. At least
not without modifying Xen to make it true.
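
For reference, the guest-side half on its own is straightforward once
the guest kernel supports 1GB pages. A minimal sketch, assuming Linux
with 1GB hugetlbfs pages reserved via hugepagesz=1G hugepages=1 on the
guest kernel command line, and a kernel new enough to know MAP_HUGE_1GB
(the names below are the stock Linux mmap flags, nothing Xen-specific):

    #include <stdio.h>
    #include <sys/mman.h>

    #ifndef MAP_HUGE_SHIFT
    #define MAP_HUGE_SHIFT 26
    #endif
    #ifndef MAP_HUGE_1GB
    #define MAP_HUGE_1GB (30 << MAP_HUGE_SHIFT)   /* log2(1GB) = 30 */
    #endif

    int main(void)
    {
        size_t len = 1UL << 30; /* 1GB */
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB |
                       MAP_HUGE_1GB, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        ((char *)p)[0] = 1; /* touch the region so the page faults in */
        munmap(p, len);
        return 0;
    }

But note that this only controls the gva -> gpa side; whether the
gpa -> ma (p2m/EPT) side uses a 1GB mapping underneath is entirely up
to Xen, which is the point above.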
> 
> >
> >>
> >> I am using this version of Xen:
> >> (XEN) Xen version 4.2.2_04-0.7.5 (abuild@) (gcc (SUSE Linux) 4.3.4
> >> [gcc-4_3-branch revision 152973]) Fri Jun 14 12:22:34 UTC 2013
> >>
> >>
> >> HAP is enabled:
> >> ...
> >> (XEN) VMX: Supported advanced features:
> >> (XEN)  - APIC MMIO access virtualisation
> >> (XEN)  - APIC TPR shadow
> >> (XEN)  - Extended Page Tables (EPT)
> >> (XEN)  - Virtual-Processor Identifiers (VPID)
> >> (XEN)  - Virtual NMI
> >> (XEN)  - MSR direct-access bitmap
> >> (XEN)  - Unrestricted Guest
> >> (XEN)  - APIC Register Virtualization
> >> (XEN)  - Virtual Interrupt Delivery
> >> (XEN) HVM: ASIDs enabled.
> >> (XEN) HVM: VMX enabled
> >> (XEN) HVM: Hardware Assisted Paging (HAP) detected
> >> (XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
> >> ...
> >>
> >> I am using the boot line option allowsuperpage=1 and the guest config
> >> includes superpages=1
> >
> > Is the former not a PV guest only thing?
> >
> > And I can't see any code about the latter in the xl toolstack.
> >
> > I thought superpages were the default, if any are available, for HVM
> > guests.
> 
> You are right about superpages, I don't think this one is used.
> For allowsuperpage, the doc says nothing about it being used only for
> PV guests, and I think it is used for all guests. The default value is
> true anyway, according to the docs. The only reference to it is in
> get_page_from_l2e() in x86/mm.c.

get_page_from_l2e is a PV only function, I think. The option was added
by bd1cd81d6484 "x86: PV support for hugepages". I suspect the docs are
just wrong.

The default in the code appears to be false, so I suspect the docs are
doubly wrong...
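
For context, a hedged from-memory reconstruction of what that patch
added to xen/arch/x86/mm.c; an uninitialised boolean_param defaults to
false, which would explain the code/doc mismatch:

    /* PV superpage mappings are only honoured when this is set. */
    bool_t __read_mostly opt_allow_superpage;
    boolean_param("allowsuperpage", opt_allow_superpage);

So unless allowsuperpage is actually given on the Xen command line, the
PV path sees it disabled.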

> >
> >> With a guest memory of 4096, I can observe EPT entries that look like this:
> >> (XEN) gfn: 10f600            mfn: 306c00            order:  9  is_pod: 0
> >> At first I thought this meant that guest 2M superpages were indeed
> >> being backed by 2M host machine superpages. I thought this was weird,
> >> since I could observe these entries even without explicitly requesting
> >> hugepages from within the guest. I set transparent hugepages in the
> >> guest to never (they seem to be enabled by default in SUSE) but I could
> >> still observe these 'order: 9' entries, which means I don't actually
> >> know what they represent.
> >
> > As I say above, the guest and p2m use of superpage mappings are
> > independent with HAP. And p2m superpages are the default for HVM.
> 
> Ok. One more question though. How does the VMM decide on the number
> of 1GB mappings and 2MB mappings to use?

I'm not sure, but I think it is based on the availability of such pages
to allocate and on the alignment of the RAM within the guest, accounting
for holes etc.
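
As a hedged sketch of the idea (illustrative only, not the actual libxc
code; the real logic lives in the HVM domain builder): at each step the
builder can only use the largest order for which the current guest frame
number is suitably aligned and enough of the region remains, and it has
to fall back a level if the hypervisor cannot find a contiguous machine
chunk of that order:

    /* Pick the largest usable extent order for the next allocation.
     * Orders are in 4KB pages: 2MB = 2^9 pages, 1GB = 2^18 pages. */
    static unsigned int pick_order(unsigned long pfn,
                                   unsigned long remaining)
    {
        if ( !(pfn & ((1UL << 18) - 1)) && remaining >= (1UL << 18) )
            return 18;              /* 1GB superpage */
        if ( !(pfn & ((1UL << 9) - 1)) && remaining >= (1UL << 9) )
            return 9;               /* 2MB superpage */
        return 0;                   /* plain 4KB page */
    }

So fragmentation of host memory, as well as alignment and holes in the
guest's layout, all reduce how many superpages you end up with.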

> When I boot a 4GB guest, I get the following mappings:
> 
> xc: info: PHYSICAL MEMORY ALLOCATION:
>   4KB PAGES: 0x0000000000000200
>   2MB PAGES: 0x00000000000003fb
>   1GB PAGES: 0x0000000000000002
> 
> I am running a 32GB machine (2 NUMA nodes, each with an Ivy Bridge CPU
> and 16GB memory, HT enabled), and I have allocated 8GB of memory to
> dom0. This is the first guest I am starting, so I probably still have
> a lot of contiguous free memory. Why not use 3 1G superpages or even
> 4?

I expect there are MMIO holes below 1MB and between 3GB and 4GB which
prevent the use of 1GB mappings; you can probably get a sense of that
from the e820 map presented to the guest.
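
The numbers above are consistent with that: 0x200 4KB pages is 2MB,
0x3fb 2MB pages is 2038MB, and the 2 1GB pages are 2048MB, for a total
of 4088MB, just short of the 4096MB requested (the remainder presumably
being video RAM and the like). The low 2MB being populated with 4KB
pages already rules out a 1GB mapping for the first GB, and the MMIO
hole cuts into the fourth, which leaves exactly the second and third GB
as whole, 1GB-aligned blocks of RAM, hence 2 rather than 3 or 4.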

> 
> >
> >> 2) 1GB superpage support. When I try to request 1GB in the guest at
> >> boot time, I get the following message from the kernel: "hugepagesz:
> >> Unsupported page size 1024 M", which is not a surprise since the
> >> pdpe1gb cpu flag is not enabled. How can I enable this flag for the
> >> domU vcpus? If this flag can be enabled, will the VMM try to map my
> >> guest 1GB superpages to host physical 1GB hugepages in the EPT?
> >
> > Does your physical CPU support this?
> >
> > The toolstacks have options for controlling the masking of guest visible
> > CPUID values. I'd be surprised if this particular one wasn't passed
> > through to guests by default.
> 
> Yes, my IvyBridge CPU supports pdpe1gb. I read here
> (http://www.gossamer-threads.com/lists/xen/devel/273636) that some
> flags were masked off by default (even when supported by the physical
> CPU) because they threaten live migration. However, I cannot find where
> this happens in the tools code (Xen 4.2.2).

Looking at xen/include/asm-x86/cpufeature.h, Xen's symbolic name for
this flag appears to be X86_FEATURE_PAGE1GB. Grep finds a few uses in
the tools and in Xen itself; most of them are PV related.

There is one relating to the hypervisor, in hvm_cpuid, where the
clearing is conditional on hvm_pse1gb_supported, which in turn depends
on the presence of the HVM_HAP_SUPERPAGE_1GB capability. IIRC your logs
said that was present, and the memory layout above shows 2 1GB pages
getting used.
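
From memory of the 4.2 source (hedged, worth double-checking against
your tree), the relevant pieces are roughly:

    /* xen/include/asm-x86/hvm/hvm.h */
    #define hvm_pse1gb_supported(d) \
        (cpu_has_page1gb && paging_mode_hap(d))

    /* xen/arch/x86/hvm/hvm.c, hvm_cpuid(), leaf 0x80000001 */
    if ( !hvm_pse1gb_supported(d) )
        *edx &= ~cpufeat_mask(X86_FEATURE_PAGE1GB);

So with HAP enabled and a capable CPU, the hypervisor itself shouldn't
be hiding the bit.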

Given that, I don't know why this isn't exposed to the guest. Might be
worth instrumenting things to find out?
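
One experiment would be to force the bit from the guest config and see
whether it sticks. Something along these lines in the cfg file (hedged:
I haven't checked which name tools/libxl/libxl_cpuid.c uses for this
flag in 4.2.2, it may be page1gb or pdpe1gb):

    cpuid = "host,page1gb=1"

If that makes the flag appear in the guest's /proc/cpuinfo, then the
masking is happening in the toolstack defaults rather than in Xen.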

Ian.

