[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 9/9] x86emul+VMX: support {RD,WR}MSRLIST


  • To: Jan Beulich <jbeulich@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Thu, 6 Apr 2023 23:03:30 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=7nbgt0HD9hbY8GSlnvGTyg7OhKOvJp4JMkJSv7rMRKA=; b=Cq+d5o/MMMx9YXycIlJr8MMzZC6Cuy5SY1WPE72xvPwktgQj1FOC6CfOsgmF9yvYotHq0HzZIk9f6yfoVeNe5S6MXfRlPbe4XuhOZIPb/GhDI6F4r/gVZdRST4xJrT51l8LfsUAs9VsmABI1JQCpyTy1RbmYFXEUSvSto9bU5MBjfw7Hrg/dCT6VvQoIWl8l8RyfTUc1al36VZmVXBPhxmGKxBYdk86/VkT7qT7+FTVjuGJ7GgbwaW2+egzcgMeTDsUbKxuf1ykGd9RJO0DBeVYStEGAtL+Z6cxQYaKP1scL2Y4+aAUtJluw0JdGZWao4+7oA5z2db2AzqMZLlYcYw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=JKJr553KU7bXVVQFWZ/Lof42RDWBPBh6kcPFPh5ux3JAa/vEeIBAbPeJdkGEz6jIEGZcsl00O1spx1z4SNXsWoK9AKcZhgsV+smE7nDyBYuB7U0SBwuAj/GHaCIqJ7+LBO/Xfk11ga23avco7jtzDu/SiXttgRI3Zku6eN94KhfaxeE6yXbWXFhwuHdPDZIey1Fscka2aDFD1cQEmTPtaptTSqdpROrXaHxjRiqtuRaUPknZ11OtmcbYMX0JT4Q/VGyL0wm8eHQCs9YX8Es0zi4S820jNIpzpRmJihoriEkvlJQ70rMPfixpTNaYYUga7+oqnTv6bZddOo42ZrJ2Cg==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>
  • Delivery-date: Thu, 06 Apr 2023 22:04:15 +0000
  • Ironport-data: A9a23:LONriqtjJIHCYm/izojw7imfiefnVIZfMUV32f8akzHdYApBsoF/q tZmKW/Ua66OajSnKIx+aty38B8EvpCDytQyGQs/ry49QyoR+JbJXdiXEBz9bniYRiHhoOCLz O1FM4Wdc5pkJpP4jk3wWlQ0hSAkjclkfpKlVKiffHg3HVQ+IMsYoUoLs/YjhYJ1isSODQqIu Nfjy+XSI1bg0DNvWo4uw/vrRChH4bKj6Fv0gnRkPaoQ5AOGySFPZH4iDfrZw0XQE9E88tGSH 44v/JnhlkvF8hEkDM+Sk7qTWiXmlZaLYGBiIlIPM0STqkAqSh4ai87XB9JFAatjsB2bnsgZ9 Tl4ncfYpTHFnEH7sL91vxFwS0mSNEDdkVPNCSDXXce7lyUqf5ZwqhnH4Y5f0YAwo45K7W9yG fMwNGosdy2MqcaP8JG2ReRtiNQKB5H7BdZK0p1g5Wmx4fcOZ7nmGv+PwOACmTA6i4ZJAOrUY NcfZXx3dhPcbhZTO1ARTpUjgOOvgXq5eDpdwL6XjfNvvy6Pk0ovjv6xaLI5efTTLSlRtm+eq njL4CLSBRYCOcbE4TGE7mitlqnEmiaTtIc6TeXnrqM33QPMroAVIEAGDHGf/Pqcs0WBBfRaA Uc+4RMwrpFnoSRHSfG4BXVUukWsvBQRRt5RGO0S8xyWx+zf5APxLngJSHtNZcIrsOcyRCc2z RmZktXxHzttvbaJD3WH+d+8sjeaKSUTa2gYakcsTgYb4t+lvIA6iDrOSMpuFOi+ididMTPtx XaMpSs3hbQWhOYK0bm2+RbMhDfEm3TSZgs85wGSW33/6Ap8PdShf9bwtQCd6utcJoGESFXHp GIDh8WV8OEJC9eKiTCJR+IOWrqu4p5pLQHhvLKmJLF5nxzFxpJpVdo4DO1WTKuxDvs5RA==
  • Ironport-hdrordr: A9a23:QGDupazBMF60XE0X39P6KrPw6L1zdoMgy1knxilNoHxuH/Bw9v re+cjzsCWftN9/Yh4dcLy7VpVoIkmsl6Kdg7NwAV7KZmCP1FdARLsI0WKI+UyCJ8SRzI9gPa cLSdkFNDXzZ2IK8PoTNmODYqodKNrsytHWuQ/HpU0dKT2D88tbnn9E4gDwKDwQeCB2QaAXOb C7/cR9qz+paR0sH7+G7ilsZZmkmzXT/qiWGCI7Ow==
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 04/04/2023 3:55 pm, Jan Beulich wrote:
> These are "compound" instructions to issue a series of RDMSR / WRMSR
> respectively. In the emulator we can therefore implement them by using
> the existing msr_{read,write}() hooks. The memory accesses utilize that
> the HVM ->read() / ->write() hooks are already linear-address
> (x86_seg_none) aware (by way of hvmemul_virtual_to_linear() handling
> this case).
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> TODO: Use VMX tertiary execution control (once bit is known; see
>       //todo-s) and then further adjust cpufeatureset.h.
>
> RFC: In vmx_vmexit_handler() handling is forwarded to the emulator
>      blindly. Alternatively we could consult the exit qualification and
>      process just a single MSR at a time (without involving the
>      emulator), exiting back to the guest after every iteration. (I
>      don't think a mix of both models makes a lot of sense.)

{RD,WR}MSRLIST are supposed to be used for context switch paths.  They
really shouldn't be exiting in the common case.

What matters here is the conditional probability of a second MSR being
intercepted too, given that one already has been.  And I don't have a
clue how to answer this.

I would not expect Introspection to be intercepting a fastpath MSR. 
(And if it does, frankly it can live with the consequences.)

For future scenarios such as reloading PMU/LBR/whatever, these will be
all-or-nothing and we'd expect to have them eagerly in context anyway.

If I were going to guess, I'd say that probably MSR_XSS or MSR_SPEC_CTRL
will be giving us the most grief here, because they're both ones that
are liable to be touched on a context switch path, and have split
host/guest bits.

> RFC: For PV priv_op_ops would need to gain proper read/write hooks,
>      which doesn't look desirable (albeit there we could refuse to
>      handle anything else than x86_seg_none); we may want to consider to
>      instead not support the feature for PV guests, requiring e.g. Linux
>      to process the lists in new pvops hooks.

Ah - funny you should ask.  See patch 2.  These are even better reasons
not to support on PV guests.

> RFC: I wasn't sure whether to add preemption checks to the loops -
>      thoughts?

I'd be tempted to.  Mostly because a guest can exert 64* longest MSR
worth of pressure here, along with associated emulation overhead.

64* "write hypercall page" sounds expensive.  So too does 64* MSR_PAT,
given all the EPT actions behind the scenes.

Its probably one of those

> With the VMX side of the spec still unclear (tertiary execution control
> bit unspecified in ISE 046) we can't enable the insn yet for (HVM) guest
> use. The precise behavior of MSR_BARRIER is also not spelled out, so the
> (minimal) implementation is a guess for now.

MSRs are expensive for several reasons.  The %ecx decode alone isn't
cheap, nor is the reserved bit checking, and that's before starting the
action itself.

What the list form gets you is the ability to pipeline the checks on the
following MSR while performing the action of the current one.  I suspect
there are plans to try and parallelise the actions too if possible.

As I understand it, BARRIER just halts the piplineing of processing the
next MSR.

~Andrew



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.