Xen project Mailing List

Re: [PATCH v2 3/3] x86/vmx: implement Notify VM Exit

From: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Thu, 16 Jun 2022 13:17:31 +0200

Cc: "Tian, Kevin" <kevin.tian@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "Cooper, Andrew" <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, "Beulich, Jan" <JBeulich@xxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, "Nakajima, Jun" <jun.nakajima@xxxxxxxxx>, "Qiang, Chenyi" <chenyi.qiang@xxxxxxxxx>

Delivery-date: Thu, 16 Jun 2022 11:17:54 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Ping?

On Thu, Jun 09, 2022 at 12:09:18PM +0200, Roger Pau Monné wrote:
> On Thu, Jun 09, 2022 at 03:39:33PM +0800, Xiaoyao Li wrote:
> > On 6/9/2022 3:04 PM, Tian, Kevin wrote:
> > > +Chenyi/Xiaoyao who worked on the KVM support. Presumably
> > > similar opens have been discussed in KVM hence they have the
> > > right background to comment here.
> > > 
> > > > From: Roger Pau Monne <roger.pau@xxxxxxxxxx>
> > > > Sent: Thursday, May 26, 2022 7:12 PM
> > > > 
> > > > Under certain conditions guests can get the CPU stuck in an unbounded
> > > > loop without the possibility of an interrupt window to occur on
> > > > instruction boundary. This was the case with the scenarios described
> > > > in XSA-156.
> > > > 
> > > > Make use of the Notify VM Exit mechanism, that will trigger a VM Exit
> > > > if no interrupt window occurs for a specified amount of time. Note
> > > > that using the Notify VM Exit avoids having to trap #AC and #DB
> > > > exceptions, as Xen is guaranteed to get a VM Exit even if the guest
> > > > puts the CPU in a loop without an interrupt window, as such disable
> > > > the intercepts if the feature is available and enabled.
> > > > 
> > > > Setting the notify VM exit window to 0 is safe because there's a
> > > > threshold added by the hardware in order to have a sane window value.
> > > > 
> > > > Suggested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> > > > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> > > > ---
> > > > Changes since v1:
> > > > - Properly update debug state when using notify VM exit.
> > > > - Reword commit message.
> > > > ---
> > > > This change enables the notify VM exit by default, KVM however doesn't
> > > > seem to enable it by default, and there's the following note in the
> > > > commit message:
> > > > 
> > > > "- There's a possibility, however small, that a notify VM exit happens
> > > > with VM_CONTEXT_INVALID set in exit qualification. In this case, the
> > > > vcpu can no longer run. To avoid killing a well-behaved guest, set
> > > > notify window as -1 to disable this feature by default."
> > > > 
> > > > It's not obviously clear to me whether the comment was meant to be:
> > > > "There's a possibility, however small, that a notify VM exit _wrongly_
> > > > happens with VM_CONTEXT_INVALID".
> > > > 
> > > > It's also not clear whether such wrong hardware behavior only affects
> > > > a specific set of hardware,
> > 
> > I'm not sure what you mean for a specific set of hardware.
> > 
> > We make it default off in KVM just in case that future silicon wrongly sets
> > VM_CONTEXT_INVALID bit. Because we make the policy that VM cannot continue
> > running in that case.
> > 
> > For the worst case, if some future silicon happens to have this kind silly
> > bug, then the existing product kernel all suffer the possibility that their
> > VM being killed due to the feature is default on.
> 
> That's IMO a weird policy. If there's such behavior in any hardware
> platform I would assume Intel would issue an errata, and then we would
> just avoid using the feature on affected hardware (like we do with
> other hardware features when they have erratas).
> 
> If we applied the same logic to all new Intel features we won't use
> any of them. At least in Xen there are already combinations of vmexit
> conditions that will lead to the guest being killed.
> > > > in a way that we could avoid enabling
> > > > notify VM exit there.
> > > > 
> > > > There's a discussion in one of the Linux patches that 128K might be
> > > > the safer value in order to prevent false positives, but I have no
> > > > formal confirmation about this. Maybe our Intel maintainers can
> > > > provide some more feedback on a suitable notify VM exit window
> > > > value.
> > 
> > The 128k is the internal threshold for SPR silicon. The internal threshold
> > is tuned by Intel for each silicon, to make sure it's big enough to avoid
> > false positive even when user set vmcs.notify_window to 0.
> > 
> > However, it varies for different processor generations.
> > 
> > What is the suitable value is hard to say, it depends on how soon does VMM
> > want to intercept the VM. Anyway, Intel ensures that even value 0 is safe.
> 
> Ideally we need a fixed default value that's guaranteed to work on all
> possible hardware that supports the feature, or alternatively a way to
> calculate a sane default window based on the hardware platform.
> 
> Could we get some wording added to the ISE regarding 0 being a
> suitable default value to use because hardware will add a threshold
> internally to make the value safe?
> 
> Thanks, Roger.
>
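
For readers following the thread, the policy under debate can be sketched in
a few lines of standalone C. This is only an illustration, not code from the
Xen patch or from KVM. The function and enum names are hypothetical, and the
bit position used for VM_CONTEXT_INVALID is an assumption made for the
example. It shows the two decisions discussed above: a negative window (the
-1 default on the KVM side) leaves the feature off, and a notify VM exit with
VM_CONTEXT_INVALID set means the vCPU can no longer run.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position, for illustration only. */
#define NOTIFY_VM_CONTEXT_INVALID (UINT64_C(1) << 0)

/* Hypothetical outcome of handling a notify VM exit. */
enum notify_action {
    NOTIFY_RESUME_GUEST, /* context is intact, the guest can keep running */
    NOTIFY_KILL_GUEST    /* context is invalid, the vCPU cannot continue */
};

/*
 * Hypothetical helper: the feature is used only when the hardware
 * supports it and the requested window is not negative. A window of 0
 * is accepted, relying on the hardware-internal threshold described
 * in the thread.
 */
bool notify_exit_enabled(bool hw_supported, int requested_window)
{
    return hw_supported && requested_window >= 0;
}

/* Hypothetical helper: choose an action from the exit qualification. */
enum notify_action handle_notify_exit(uint64_t exit_qualification)
{
    if (exit_qualification & NOTIFY_VM_CONTEXT_INVALID)
        return NOTIFY_KILL_GUEST;

    return NOTIFY_RESUME_GUEST;
}
```

With this shape, the disagreement in the thread reduces to which window the
caller passes by default: 0 (feature on, as the patch proposes for Xen) or -1
(feature off, as described for KVM).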
