Xen project Mailing List

Re: [PATCH] x86/HVM: don't mark evtchn upcall vector as pending when vLAPIC is disabled

To: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Thu, 24 Nov 2022 08:59:00 +0100

Cc: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Paul Durrant <paul@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Thu, 24 Nov 2022 07:59:40 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 23.11.2022 13:03, Roger Pau Monné wrote:
> On Mon, Nov 21, 2022 at 01:34:38PM +0100, Jan Beulich wrote:
>> On 21.11.2022 13:23, Andrew Cooper wrote:
>>> On 21/11/2022 08:56, Jan Beulich wrote:
>>>> On 18.11.2022 15:27, Andrew Cooper wrote:
>>>>> I even got as far as writing that maybe leaving it as-is was the best
>>>>> option (principle of least surprise for Xen developers), but our
>>>>> "friend" apic acceleration strikes again.
>>>>>
>>>>> Xen doesn't always get a VMExit when the guest clears SW_DISABLE,
>>>>> because microcode may have accelerated it.
>>>> But as per "APIC-Write Emulation" in the SDM we'd still get an APIC-write
>>>> VM exit.
>>>
>>> Intel isn't the only accelerated implementation, and there are future
>>> details not in the public docs.
>>>
>>> There will be an implementation we will want to support where Xen
>>> doesn't get a vmexit for a write to SPIV.
>>
>> I see.
>>
>>>> If we didn't, how would our internal accounting of APIC enabled
>>>> state (VLAPIC_SW_DISABLED) work?
>>>
>>> It doesn't.
>>>
>>> One of many problems on the "known errors" list from an incomplete
>>> original attempt to get acceleration working.
>>>
>>> There's no good reason to cache those disables in the first place (both
>>> are trivially available from other positions in memory), and
>>> correctness reasons not to.
>>>
>>>> And the neighboring (to where I'm adding
>>>> the new code) pt_may_unmask_irq() call then also wouldn't occur.
>>>>
>>>> I'm actually pretty sure we do too much in this case - in particular none
>>>> of the vlapic_set_reg() should be necessary. But we certainly can't get
>>>> away with doing nothing, and hence we depend on that VM exit to actually
>>>> occur.
>>>
>>> We must do exactly and only what real hardware does, so that the state
>>> changes emulated by Xen are identical to those accelerated by microcode.
>>>
>>> Other than that, I really wouldn't make any presumptions about the
>>> existing vLAPIC logic being correct.
>>>
>>> It is, at best, enough of an approximation to the spec for major OSes to
>>> function, with multiple known bugs and no coherent testing.
>>
>> But can we leave resolving of the wider issue then separate, and leave
>> the change here as it presently is? Yes, mimic-ing the same behavior
>> later may be "interesting", but if we can't achieve consistent behavior
>> with yet more advanced acceleration, maybe we simply can't use that
>> (perhaps until a complete overhaul of everything involved in LAPIC
>> handling, possibly including a guest side indicator that they're happy
>> without the extra signaling, at which point that yet-more-advanced
>> acceleration could then be enabled for that guest).
>>
>> Otherwise - do you have any suggestion as to alternative logic which I
>> might use in this patch?
>
> Maybe the underlying issue is that we allow executing
> HVMOP_set_evtchn_upcall_vector against remote vCPU. This could be
> solved by only allowing HVMOP_set_evtchn_upcall_vector against the
> current vCPU and with the LAPIC enabled, but I guess we are too late
> for that.

Allowing things like this ahead of bringing a vCPU online can be
helpful for the OS side implementation, I think.

> Actually, what about only injecting the spurious event if the vCPU is
> online, ie:
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index ae4368ec4b..4b84706579 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4105,7 +4105,8 @@ static int hvmop_set_evtchn_upcall_vector(
>      printk(XENLOG_G_INFO "%pv: upcall vector %02x\n", v, op.vector);
>
>      v->arch.hvm.evtchn_upcall_vector = op.vector;
> -    hvm_assert_evtchn_irq(v);
> +    if ( is_vcpu_online(v) )
> +        hvm_assert_evtchn_irq(v);

While this would match what hvm_set_callback_via() does, see my
post-commit-message remark suggesting to key this to
evtchn_upcall_pending. Tying the call to the vCPU being online looks
pretty arbitrary to me, the more that this would continue to be
- racy with the vCPU coming online, and perhaps already being past
  VCPUOP_initialise - after all VCPUOP_up is little more than clearing
  VPF_down,
- problematic wrt evtchn_upcall_pending, once set, preventing event
  injection later on.
As you may have inferred already, I'm inclined to suggest to drop the
is_vcpu_online() check from hvm_set_callback_via().

One related question here is whether vlapic_do_init() shouldn't have
the non-architectural side effect of clearing evtchn_upcall_pending.
While this again violates the principle of the hypervisor only ever
setting that bit, it would deal with the risk of no further event
injection once the flag is set, considering that vlapic_do_init()
clears IRR (and ISR).

Jan

> I think that should fix the issue seen on Linux. We would
> additionally need to fix hvm_assert_evtchn_irq() to only set the
> vector if the vLAPIC is enabled.
>
> Thanks, Roger.
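
For readers following the thread, the two gating conditions under discussion can be compared in a small stand-alone model. This is only a sketch and not Xen code: struct toy_vcpu and every toy_* function are hypothetical stand-ins invented for illustration, and the non-zero vector check in toy_assert_evtchn_irq() is an assumption of the model. Variant A mirrors the is_vcpu_online() gate from the quoted diff, variant B mirrors the suggestion to key the call to evtchn_upcall_pending, and toy_vlapic_do_init() models the open question of whether INIT should also clear the pending flag.

```c
/* Toy model only: NOT Xen code. All names are hypothetical stand-ins. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct toy_vcpu {
    bool     online;          /* models the vCPU not being VPF_down */
    bool     vlapic_enabled;  /* models the vLAPIC not being disabled */
    bool     upcall_pending;  /* models evtchn_upcall_pending */
    uint8_t  upcall_vector;   /* models evtchn_upcall_vector */
    uint32_t irr[8];          /* 256-bit IRR bitmap */
};

/* Return whether vector 'vec' is set in the modelled IRR. */
static bool toy_irr_test(const struct toy_vcpu *v, uint8_t vec)
{
    return (v->irr[vec / 32] >> (vec % 32)) & 1u;
}

/* Mark the upcall vector pending only if the vLAPIC is enabled. */
static void toy_assert_evtchn_irq(struct toy_vcpu *v)
{
    assert(v != NULL);
    if ( v->upcall_vector && v->vlapic_enabled )
        v->irr[v->upcall_vector / 32] |=
            UINT32_C(1) << (v->upcall_vector % 32);
}

/* Variant A: gate the injection on the vCPU being online. */
static void toy_set_vector_if_online(struct toy_vcpu *v, uint8_t vec)
{
    v->upcall_vector = vec;
    if ( v->online )
        toy_assert_evtchn_irq(v);
}

/* Variant B: gate the injection on an event actually being pending. */
static void toy_set_vector_if_pending(struct toy_vcpu *v, uint8_t vec)
{
    v->upcall_vector = vec;
    if ( v->upcall_pending )
        toy_assert_evtchn_irq(v);
}

/* Model INIT clearing IRR; optionally also clear the pending flag. */
static void toy_vlapic_do_init(struct toy_vcpu *v, bool clear_pending)
{
    memset(v->irr, 0, sizeof(v->irr));
    if ( clear_pending )
        v->upcall_pending = false;
}
```

In this model, an offline vCPU with upcall_pending already set receives no injection under variant A, which is the situation where the set flag could prevent later event injection. Variant B injects in that case, provided the modelled vLAPIC is enabled.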
