[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/HVM: don't mark evtchn upcall vector as pending when vLAPIC is disabled


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 24 Nov 2022 08:59:00 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=/g0mthTgwGBQHAZGnHS8c+Z1/zECCof3xKXkMQyy7Qk=; b=FktN/+BfS3KxXgltZRqJXWAk39ydZD9ihoFndkbEvzYt75/EkofJzEXf+GgNyvHAkAMi/GBYDqR0wMRHkjjZleyJucdd4l/sh+trrmqPNxLBxMw4ZoOOej3zLk71onygpxlpk87KNKcfzwsmEmzHs6nvAbj/u1tvHX5x2Xr3eQDEi9UBRguhBcO7LA4nhm7G2zgd7qokGDOTaaWkdN4L978Duo22ayB1X/5zZa9EaiFkwHFhUdVSmO1mYw3ccs2dBYJTuMhmXj2HwlkbXLcdVxJArOPAWbfY2C1Jd7kGQkhf4ySY2OMtZO8/OxxisdOQ9YLp4pdtshsz+ceaQNej0A==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=UKfM5pDEUMfewupfXXoG7TWQKw6sf5jaPh3qr1NfIglQDLAGPOkSK2R+YmO4bFZoG/8JY/ICPMUAYV4WJp9HTe/l8tkgJyyY48a88c0AsWadKi+8+PDiaKUt1DtDvXkBG3SpPO6wHXj9FcI+CUCS343Ffm5pI05rPnXW51hBPHBNz6tUWMkw6t6UlQvlMION9/H2dAeK7C7VMabsKTbPJyV2hJeohR8e6EoaRZwWbXUgKcjAy4C2H8zpEVpkrKvwNopqbv8j4V6d/0YJ968zsk/4FiDM51Ess9iWPVXddiUym2Jb6AwjYUFvMpa0YWYJhyIsHbs5jnK/tq2AXaTolA==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Paul Durrant <paul@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 24 Nov 2022 07:59:40 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 23.11.2022 13:03, Roger Pau Monné wrote:
> On Mon, Nov 21, 2022 at 01:34:38PM +0100, Jan Beulich wrote:
>> On 21.11.2022 13:23, Andrew Cooper wrote:
>>> On 21/11/2022 08:56, Jan Beulich wrote:
>>>> On 18.11.2022 15:27, Andrew Cooper wrote:
>>>>> I even got as far as writing that maybe leaving it as-is was the best
>>>>> option (principle of least surprise for Xen developers), but our
>>>>> "friend" apic acceleration strikes again.
>>>>>
>>>>> Xen doesn't always get a VMExit when the guest clears SW_DISABLE,
>>>>> because microcode may have accelerated it.
>>>> But as per "APIC-Write Emulation" in the SDM we'd still get an APIC-write
>>>> VM exit.
>>>
>>> Intel isn't the only accelerated implementation, and there future
>>> details not in the public docs.
>>>
>>> There will be an implementation we will want to support where Xen
>>> doesn't get a vmexit for a write to SPIV.
>>
>> I see.
>>
>>>> If we didn't, how would our internal accounting of APIC enabled
>>>> state (VLAPIC_SW_DISABLED) work?
>>>
>>> It doesn't.
>>>
>>> One of many problems on the "known errors" list from an incomplete
>>> original attempt to get acceleration working.
>>>
>>> There's no good reason to cache those disables in the first place (both
>>> are both trivially available from other positions in memory), and
>>> correctness reasons not to.
>>>
>>>>  And the neighboring (to where I'm adding
>>>> the new code) pt_may_unmask_irq() call then also wouldn't occur.
>>>>
>>>> I'm actually pretty sure we do too much in this case - in particular none
>>>> of the vlapic_set_reg() should be necessary. But we certainly can't get
>>>> away with doing nothing, and hence we depend on that VM exit to actually
>>>> occur.
>>>
>>> We must do exactly and only what real hardware does, so that the state
>>> changes emulated by Xen are identical to those accelerated by microcode.
>>>
>>> Other than that, I really wouldn't make any presumptions about the
>>> existing vLAPIC logic being correct.
>>>
>>> It is, at best, enough of an approximation to the spec for major OSes to
>>> function, with multiple known bugs and no coherent testing.
>>
>> But can we leave resolving of the wider issue then separate, and leave
>> the change here as it presently is? Yes, mimic-ing the same behavior
>> later may be "interesting", but if we can't achieve consistent behavior
>> with yet more advanced acceleration, maybe we simply can't use that
>> (perhaps until a complete overhaul of everything involved in LAPIC
>> handling, possibly including a guest side indicator that they're happy
>> without the extra signaling, at which point that yet-more-advanced
>> acceleration could then be enabled for that guest).
>>
>> Otherwise - do you have any suggestion as to alternative logic which I
>> might use in this patch?
> 
> Maybe the underlying issue is that we allow executing
> HVMOP_set_evtchn_upcall_vector against remote vCPU.  This could be
> solved by only allowing HVMOP_set_evtchn_upcall_vector against the
> current vCPU and with the LAPIC enabled, but I guess we are too late
> for that.

Allowing things like this ahead of bringing a vCPU online can be
helpful for the OS side implementation, I think.

> Actually, what about only injecting the spurious event if the vCPU is
> online, ie:
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index ae4368ec4b..4b84706579 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4105,7 +4105,8 @@ static int hvmop_set_evtchn_upcall_vector(
>      printk(XENLOG_G_INFO "%pv: upcall vector %02x\n", v, op.vector);
>  
>      v->arch.hvm.evtchn_upcall_vector = op.vector;
> -    hvm_assert_evtchn_irq(v);
> +    if ( is_vcpu_online(v) )
> +        hvm_assert_evtchn_irq(v);

While this would match what hvm_set_callback_via() does, see my post-
commit-message remark suggesting to key this to evtchn_upcall_pending.
Tying the call to the vCPU being online looks pretty arbitrary to me,
the more that this would continue to be
- racy with the vCPU coming online, and perhaps already being past
  VCPUOP_initialise - after all VCPUOP_up is little more than clearing
  VPF_down,
- problematic wrt evtchn_upcall_pending, once set, preventing event
  injection later on.
As you may have inferred already, I'm inclined to suggest to drop the
the is_vcpu_online() check from hvm_set_callback_via().

One related question here is whether vlapic_do_init() shouldn't have
the non-architectural side effect of clearing evtchn_upcall_pending.
While this again violates the principle of the hypervisor only ever
setting that bit, it would deal with the risk of no further event
injection once the flag is set, considering that vlapic_do_init()
clears IRR (and ISR).

Jan

> I think that should fix the issue seen on Linux.  We would
> additionally need to fix hvm_assert_evtchn_irq() to only set the
> vector if the vLAPIC is enabled.
> 
> Thanks, Roger.




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.