[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 12/30] xen/x86: Generate deep dependencies of features



>>> On 15.02.16 at 20:07, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 15/02/16 16:27, Jan Beulich wrote:
>>>>> On 15.02.16 at 17:09, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 15/02/16 15:52, Jan Beulich wrote:
>>>>>>> --- a/xen/tools/gen-cpuid.py
>>>>>>> +++ b/xen/tools/gen-cpuid.py
>>>>>>> @@ -138,6 +138,61 @@ def crunch_numbers(state):
>>>>>>>      state.hvm_shadow = featureset_to_uint32s(state.raw_hvm_shadow, 
>>>>> nr_entries)
>>>>>>>      state.hvm_hap = featureset_to_uint32s(state.raw_hvm_hap, 
>>>>>>> nr_entries)
>>>>>>>  
>>>>>>> +    deps = {
>>>>>>> +        XSAVE:
>>>>>>> +        (XSAVEOPT, XSAVEC, XGETBV1, XSAVES, AVX, MPX),
>>>>>>> +
>>>>>>> +        AVX:
>>>>>>> +        (FMA, FMA4, F16C, AVX2, XOP),
>>>>>> I continue to question whether namely XOP, but perhaps also the
>>>>>> others here except maybe AVX2, really is depending on AVX, and
>>>>>> not just on XSAVE.
>>>>> I am sure we have had this argument before.
>>>> Indeed, hence the "I continue to ...".
>>>>
>>>>> All VEX encoded SIMD instructions (including XOP which is listed in the
>>>>> same category by AMD) are specified to act on 256bit AVX state, and
>>>>> require AVX enabled in xcr0 to avoid #UD faults.  This includes VEX
>>>>> instructions encoding %xmm registers, which explicitly zero the upper
>>>>> 128bits of the associated %ymm register.
>>>>>
>>>>> This is very clearly a dependency on AVX, even if it isn't written in
>>>>> one clear concise statement in the instruction manuals.
>>>> The question is what AVX actually means: To me it's an instruction set
>>>> extension, not one of machine state. The machine state extension to
>>>> me is tied to XSAVE (or more precisely to XSAVE's YMM state). (But I
>>>> intentionally say "to me", because I can certainly see why this may be
>>>> viewed differently.)
>>> The AVX feature bit is also the indicator that the AVX bit may be set in
>>> XCR0, which links it to machine state and not just instruction sets.
>> No, it's not (and again - there's no bit named AVX in XCR0):
> 
> (and again - Intel disagree) The Intel manual uniformly refers to
> XCR0.AVX (bit 2).  AMD uses XCR0.YMM.

Interesting. I'm sure early documentation was different, which
would be in line with the bits named YMM and ZMM etc in our
code base. But anyway.

>>  Which
>> bits can be set in XCR0 is enumerated by CPUID[0xd].EDX:EAX,
>> which is - surprise, surprise - the so called XSTATE leaf (i.e. related
>> to XSAVE, and not to AVX).
> 
> In hardware, all these bits are almost certainly hardwired on or off. 
> Part of the issue here is that with virtualisation, there are a whole
> lot more combinations than exist on real hardware.
> 
> Whether right or wrong, the values for guests values for
> CPUID[0xd].EDX:EAX are now generated from the guest featureset.  This is
> based on my assumption that that's how real hardware actually works, and
> prevents the possibility of them getting out of sync.

Which I agree with. In this context, however, I wonder how you
mean to do what you say above. I don't recall having see any
related code, but of course this may well be in one of the patches
I didn't yet get around to look at.

>>>>  Note how you yourself have recourse to XCR0,
>>>> which is very clearly tied to XSAVE and not AVX, above (and note also
>>>> that there's nothing called AVX to be enabled in XCR0, it's YMM that
>>>> you talk about).
>>> The key point is this.  If I choose to enable XSAVE and disable AVX for
>>> a domain, that domain is unable to FMA/FMA4/F16C instructions.  It
>>> therefore shouldn't see the features.
>> Are you sure? Did you try?
> 
> Yes
> 
> void test_main(void)
> {
>     printk("AVX Testing\n");
> 
>     write_cr4(read_cr4() | X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT | 
> X86_CR4_OSXSAVE);
> 
>     asm volatile ("xsetbv" :: "a" (0x7), "d" (0), "c" (0));
>     asm volatile ("vfmadd132pd %xmm0, %xmm1, %xmm2");
> 
>     asm volatile ("xsetbv" :: "a" (0x3), "d" (0), "c" (0));
>     asm volatile ("vfmadd132pd %xmm0, %xmm1, %xmm2");

Here you clear the bit in XCR0, which wasn't the question. The
question was whether VFMADD... would fault when the CPUID
AVX bit is clear.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.