Xen project Mailing List

Re: [Xen-devel] [PATCH v3 01/34] x86emul: support AVX512 opmask insns

To: "Andrew Cooper" <andrew.cooper3@xxxxxxxxxx>

From: "Jan Beulich" <JBeulich@xxxxxxxx>

Date: Fri, 26 Oct 2018 06:34:02 -0600

Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>

Delivery-date: Fri, 26 Oct 2018 12:34:25 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

>>> On 26.10.18 at 14:19, <andrew.cooper3@xxxxxxxxxx> wrote: > On 26/10/18 12:59, Jan Beulich wrote: >>>>> On 26.10.18 at 13:29, <andrew.cooper3@xxxxxxxxxx> wrote: >>> On 26/10/18 10:03, Jan Beulich wrote: >>>>>>> On 25.10.18 at 20:32, <andrew.cooper3@xxxxxxxxxx> wrote: >>>>> On 18/09/18 12:53, Jan Beulich wrote: >>>>>> @@ -1187,6 +1188,11 @@ static int _get_fpu( >>>>>> return X86EMUL_UNHANDLEABLE; >>>>>> break; >>>>>> >>>>>> + case X86EMUL_FPU_opmask: >>>>>> + if ( !(xcr0 & X86_XCR0_SSE) || !(xcr0 & X86_XCR0_OPMASK) ) >>>>>> + return X86EMUL_UNHANDLEABLE; >>>>>> + break; >>>>> I see this follows the pattern from X86EMUL_FPU_ymm, but by the SSE >>>>> check? It is not relevant at this point - if xcr0.opmask is set, the >>>>> opmask instructions should be usable. >>>> I would agree with you from a functional POV, but please see the >>>> last row of the table named "OS XSAVE Enabling Requirements of >>>> Instruction Categories" in SDM Vol 2. >>> Hmm. That table is in contradiction to Vol 3 Figure 2-8 and associated >>> description, which enumerates the #GP conditions of XSETBV. >>> >>> In particular, it suggests that OPMASK must be set in union with the ZMM >>> bits, and cannot be set without YMM. >> Hmm, true. However ... >> >>> IMO, we can require/expect get_xcr0() to pass an architecturally valid >>> result. I don't think we should be checking the transitive closure of >>> feature dependences here. >> ... I'd prefer if the code remained consistent, and hence the >> analogy with X86EMUL_FPU_ymm should remain, the more that >> this way it's also more consistent with Vol 2's #UD condition table. >> >> Furthermore I consider some of the restrictions in table 2-8 quite >> a bit too restrictive, and hence I wouldn't be surprised if they >> got relaxed at some point. It has puzzled me from the very >> beginning that e.g. Hi16_ZMM needs to be set by 32-bit OSes >> despite there being no means to access the upper 16 ZMM >> registers. > > It almost certainly simplifies the circuitry. The high parts will be 0 > (due to the zero extending property of VEX/EVEX) and unallocated in the > physical register file. High parts? We're talking of entirely distinct registers (zmm16 - zmm31) here. But I agree it's likely a simplification for them, just perhaps elsewhere. >> By keeping the code as is we'd have no issue with >> any such relaxation. > > Right. For now, lets leave this consistent with the surrounding code. > Acked-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> Thanks. > I think we need need Intel to fix their docs, and based on that, I think > we should do some uniform changes to handling to remove unnecessary > dependency links. I'll fire off a request. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.