
Re: [Xen-devel] [PATCH v2 05/30] xen/public: Export cpu featureset information in the public API



On 20/02/16 17:39, Joao Martins wrote:
>
>>>>>  and given that this is exposed on both sysctl and libxl (through
>>>>> libxl_hwcap) shouldn't its size match the real one
>>>>> (boot_cpu_data.x86_capability), i.e. NCAPINTS?  Additionally I see
>>>>> that libxl_hwcap is also hardcoded to 8 alongside struct
>>>>> xen_sysctl_physinfo when it should be 10?
>>>> Hardcoding of the size in sysctl can be worked around. Fixing libxl is
>>>> harder.
>>>>
>>>> The synthetic leaves are internal and should not be exposed.
>>>>
>>>>> libxl users could potentially make use of this hwcap field to see
>>>>> what features the host CPU supports.
>>>> The purpose of the new featureset interface is to have stable object
>>>> which can be used by higher level toolstacks.
>>>>
>>>> This is done by pretending that hw_caps never existed, and replacing it
>>>> wholesale with a bitmap, (specified as variable length and safe to
>>>> zero-extend), with an ABI in the public header files detailing what each
>>>> bit means.
>>> Given that you introduce a new API for libxc (xc_get_cpu_featureset())
>>> perhaps an equivalent to libxl could also be added? That way users of
>>> libxl could also query which features the host and guests support. I
>>> would be happy to produce patches towards that.
>> In principle, this is fine.  Part of this is covered by the xen-cpuid
>> utility in a later patch.
>>
> OK.
>
>> Despite my plans to further rework guest cpuid handling, the principle
>> of the {raw,host,pv,hvm}_featuresets is expected to stay, and be usable
>> in their current form.
> That's great to hear. The reason I brought this up is because libvirt has
> the idea of a cpu model and the features associated with it (similar to
> the qemu -cpu XXX,+feature,-feature stuff, but in a hypervisor-agnostic
> manner that other architectures can also use). libvirt could do mostly
> everything on its own, but it still needs to know what the host supports.
> Based on that, it then calculates the lowest common denominator of cpu
> features to be enabled or masked out for guests when comparing to an older
> family in a pool of servers. However, PV/HVM (with{,out} hap/shadow) have
> different feature sets, as you mention, so libvirt might run into errors
> because it cannot be sure whether a certain feature will be set or masked
> for a certain type of guest. So knowing those (i.e. the
> {pv,hvm,...}_featuresets) in advance lets libxl users make more reliable
> use of the libxl cpuid policies to more correctly normalize the cpuid for
> each type of guest.

Does libvirt currently use hw_caps (and my series will inadvertently
break it), or are you looking to do some new work for future benefit?

Sadly, cpuid levelling is a quagmire and not as simple as just choosing
the common subset of bits.  When I started this project I was expecting
it to be bad, but nothing like as bad as it has turned out to be.

As an example, take the "deprecates fcs/fds" bit, which is the subject of
the "inverted" mask.  The meaning of the bit is "hardware no longer supports
x87 fcs/fds, and they are hardwired to zero".

Originally, the point of the inverted mask was to make a "featureset"
which could be levelled sensibly without specific knowledge of the
meaning of each bit.  This property is important for forwards
compatibility, and avoiding unnecessary complexity in higher level
toolstack components.
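
Roughly, levelling via the inverted mask works like this (a minimal sketch,
not the actual Xen code; the function and parameter names are invented for
illustration):

#include <stdint.h>
#include <stddef.h>

/*
 * Compute the common level of two featuresets without knowing what any
 * individual bit means.  Bits set in inverted_mask have inverted polarity
 * (0 is the "more capable" setting), so flip them before taking the
 * intersection and flip them back afterwards.
 */
static void level_featuresets(uint32_t *out,
                              const uint32_t *a, const uint32_t *b,
                              const uint32_t *inverted_mask,
                              size_t nr_words)
{
    for ( size_t i = 0; i < nr_words; ++i )
        out[i] = ((a[i] ^ inverted_mask[i]) &
                  (b[i] ^ inverted_mask[i])) ^ inverted_mask[i];
}

For a plain bit this reduces to a straight AND; for an inverted bit such as
"deprecates fcs/fds" it yields the OR, i.e. the bit is advertised as soon as
any host in the pool has the old behaviour removed.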

However, with hindsight, attempting to level this bit is pointless.  It
is a statement about a change in pre-existing behaviour of an element of
the cpu pipeline, and the pipeline behaviour will not change depending
on how the bit is advertised to the guest.  Another bit, "fdp exception
only", is in a similar bucket.

Other issues, which I haven't even tried to tackle in this series, are
items such as the MXCSR mask.  The real value cannot be levelled, is
expected to remain constant after boot, and is liable to induce #GP faults
on fxrstor if it changes.  Alternatively, there is EFER.LMSLE (long mode
segment limit enable), which doesn't even have a feature bit to indicate
availability (not that I can plausibly see an OS actually turning that
feature on).
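
For reference, the "real value" in question comes straight out of the
fxsave image (a userspace-style sketch, not Xen code; per the SDM,
MXCSR_MASK sits at byte offset 28 of the save area, and a saved value of 0
means the default mask 0xFFBF):

#include <stdint.h>
#include <string.h>

static uint32_t read_mxcsr_mask(void)
{
    /* FXSAVE needs a 512-byte, 16-byte-aligned save area. */
    uint8_t area[512] __attribute__((aligned(16))) = { 0 };
    uint32_t mask;

    asm volatile ( "fxsave %0" : "+m" (area) );

    /* MXCSR_MASK lives at byte offset 28 of the save image. */
    memcpy(&mask, &area[28], sizeof(mask));

    /* A saved value of 0 means the default mask, 0x0000FFBF. */
    return mask ? mask : 0x0000ffbf;
}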

A toolstack needs to handle all of the following (sketched below):
* The maximum "configuration" available to a guest on the available servers.
* Which bits of that can be controlled, and which will simply leak through.
* What the guest actually saw when it booted.

(I use "configuration" here to include items such as max leaf, max phys
addr, etc., which are important to level, but are not included in the
plain feature bits in cpuid).
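
To put that in concrete terms, purely as a hypothetical illustration (none
of these names exist in Xen or libxl), the per-guest state a toolstack
would need to track might look like:

#include <stdint.h>

#define FEATURESET_NR_ENTRIES 10  /* assumed width; the interface itself is
                                     variable length */

struct guest_cpu_view {
    /* Maximum "configuration" available across the pool (feature bits only
     * here; max leaf, maxphysaddr etc. would need representing too). */
    uint32_t max_config[FEATURESET_NR_ENTRIES];

    /* Bits the toolstack can actually control ... */
    uint32_t controllable[FEATURESET_NR_ENTRIES];

    /* ... and bits which simply leak through from the host. */
    uint32_t leaked[FEATURESET_NR_ENTRIES];

    /* What the guest actually saw when it booted. */
    uint32_t booted_with[FEATURESET_NR_ENTRIES];
};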

My longterm plans involve:
* Having Xen construct a full "maximum" cpuid policy, rather than just a
featureset.
* Per-domain cpuid policy, seeded from maximum on domain_create, and
modified where appropriate (e.g. hap vs shadow, PV guest switching
between native and compat mode).
* All validity checking for updates in the set_cpuid hypercall rather
than being deferred to the cpuid intercept point.
* A get_cpuid hypercall so a toolstack can actually retrieve the policy
a guest will see.

Even further work involves:
* Putting all this information into the migration stream, rather than having
it regenerated by the destination toolstack.
* MSR levelling.

But that is a huge quantity more work, which is why this series focuses on
the featureset alone, in the hope that the featureset is still a useful
discrete item outside the context of a full cpuid policy.

I guess my question at the end of all this is: how much of this does
libvirt currently handle?  We certainly can wire the featureset
information through libxl, but it is insufficient in the general case
for making migration safe.

~Andrew



 

