
Re: [Xen-devel] [PATCH v9 05/23] x86emul: support AVX512F gather insns


  • To: Jan Beulich <JBeulich@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Thu, 4 Jul 2019 19:26:17 +0100
  • Cc: Wei Liu <wl@xxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>
  • Delivery-date: Thu, 04 Jul 2019 18:26:31 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 04/07/2019 15:25, Jan Beulich wrote:
> On 04.07.2019 16:16, Andrew Cooper wrote:
>> On 04/07/2019 15:10, Andrew Cooper wrote:
>>> On 01/07/2019 12:18, Jan Beulich wrote:
>>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>>> @@ -9100,6 +9100,133 @@ x86_emulate(
>>>>            put_stub(stub);
>>>>    
>>>>            if ( rc != X86EMUL_OKAY )
>>>> +            goto done;
>>>> +
>>>> +        state->simd_size = simd_none;
>>>> +        break;
>>>> +    }
>>>> +
>>>> +    case X86EMUL_OPC_EVEX_66(0x0f38, 0x90): /* vpgatherd{d,q} mem,[xyz]mm{k} */
>>>> +    case X86EMUL_OPC_EVEX_66(0x0f38, 0x91): /* vpgatherq{d,q} mem,[xyz]mm{k} */
>>>> +    case X86EMUL_OPC_EVEX_66(0x0f38, 0x92): /* vgatherdp{s,d} mem,[xyz]mm{k} */
>>>> +    case X86EMUL_OPC_EVEX_66(0x0f38, 0x93): /* vgatherqp{s,d} mem,[xyz]mm{k} */
>>>> +    {
>>>> +        typeof(evex) *pevex;
>>>> +        union {
>>>> +            int32_t dw[16];
>>>> +            int64_t qw[8];
>>>> +        } index;
>>>> +        bool done = false;
>>>> +
>>>> +        ASSERT(ea.type == OP_MEM);
>>>> +        generate_exception_if((!evex.opmsk || evex.brs || evex.z ||
>>>> +                               evex.reg != 0xf ||
>>>> +                               modrm_reg == state->sib_index),
>>>> +                              EXC_UD);
>>>> +        avx512_vlen_check(false);
>>>> +        host_and_vcpu_must_have(avx512f);
>>>> +        get_fpu(X86EMUL_FPU_zmm);
>>>> +
>>>> +        /* Read destination and index registers. */
>>>> +        opc = init_evex(stub);
>>>> +        pevex = copy_EVEX(opc, evex);
>>>> +        pevex->opcx = vex_0f;
>>>> +        opc[0] = 0x7f; /* vmovdqa{32,64} */
>>>> +        /*
>>>> +         * The register writeback below has to retain masked-off elements, but
>>>> +         * needs to clear upper portions in the index-wider-than-data cases.
>>>> +         * Therefore read (and write below) the full register. The alternative
>>>> +         * would have been to fiddle with the mask register used.
>>>> +         */
>>>> +        pevex->opmsk = 0;
>>>> +        /* Use (%rax) as destination and modrm_reg as source. */
>>>> +        pevex->b = 1;
>>>> +        opc[1] = (modrm_reg & 7) << 3;
>>>> +        pevex->RX = 1;
>>>> +        opc[2] = 0xc3;
>>>> +
>>>> +        invoke_stub("", "", "=m" (*mmvalp) : "a" (mmvalp));
>>>> +
>>>> +        pevex->pfx = vex_f3; /* vmovdqu{32,64} */
>>>> +        pevex->w = b & 1;
>>>> +        /* Switch to sib_index as source. */
>>>> +        pevex->r = !mode_64bit() || !(state->sib_index & 0x08);
>>>> +        pevex->R = !mode_64bit() || !(state->sib_index & 0x10);
>>>> +        opc[1] = (state->sib_index & 7) << 3;
>>>> +
>>>> +        invoke_stub("", "", "=m" (index) : "a" (&index));
>>>> +        put_stub(stub);
>>>> +
>>>> +        /* Clear untouched parts of the destination and mask values. */
>>>> +        n = 1 << (2 + evex.lr - ((b & 1) | evex.w));
>>>> +        op_bytes = 4 << evex.w;
>>>> +        memset((void *)mmvalp + n * op_bytes, 0, 64 - n * op_bytes);
>>>> +        op_mask &= (1 << n) - 1;
>>>> +
>>>> +        for ( i = 0; op_mask; ++i )
>>>> +        {
>>>> +            signed long idx = b & 1 ? index.qw[i] : index.dw[i];
>>> No signed.  However, surely this needs to be int64_t anyway, to function
>>> correctly in a 32bit build of the test harness?
>>>
>>> The SDM says VPGATHERQD is encodable in 32bit with quadword indices.
>>>
>>> ~Andrew
>>>
>>>> +
>>>> +            if ( !(op_mask & (1 << i)) )
>>>> +                continue;
>>>> +
>>>> +            rc = ops->read(ea.mem.seg,
>>>> +                           truncate_ea(ea.mem.off + (idx << state->sib_scale)),
>> Actually, what SDM says is:
>>
>> "The scaled index may require more bits to represent than the address
>> bits used by the processor (e.g., in 32-bit mode, if the scale is
>> greater than one). In this case, the most significant bits beyond the
>> number of address bits are ignored."
>>
>> That reads as if it means "ea.mem.off + (u32)(idx <<
>> state->sib_scale)".
> Why "reads as if"? What else could a 32-bit address computation look
> like? (In essence truncate_ea() will truncate to 32 bits anyway when
> 32-bit addressing is in use, so the inner truncation is simply
> redundant.)

Ok - I think it will DTRT.

Acked-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
