Re: [PATCH] x86/HVM: don't mark evtchn upcall vector as pending when vLAPIC is disabled
On Thu, Nov 24, 2022 at 08:59:00AM +0100, Jan Beulich wrote:
> On 23.11.2022 13:03, Roger Pau Monné wrote:
> > On Mon, Nov 21, 2022 at 01:34:38PM +0100, Jan Beulich wrote:
> >> On 21.11.2022 13:23, Andrew Cooper wrote:
> >>> On 21/11/2022 08:56, Jan Beulich wrote:
> >>>> On 18.11.2022 15:27, Andrew Cooper wrote:
> >>>>> I even got as far as writing that maybe leaving it as-is was the best
> >>>>> option (principle of least surprise for Xen developers), but our
> >>>>> "friend" apic acceleration strikes again.
> >>>>>
> >>>>> Xen doesn't always get a VMExit when the guest clears SW_DISABLE,
> >>>>> because microcode may have accelerated it.
> >>>> But as per "APIC-Write Emulation" in the SDM we'd still get an APIC-write
> >>>> VM exit.
> >>>
> >>> Intel isn't the only accelerated implementation, and there are future
> >>> details not in the public docs.
> >>>
> >>> There will be an implementation we will want to support where Xen
> >>> doesn't get a vmexit for a write to SPIV.
> >>
> >> I see.
> >>
> >>>> If we didn't, how would our internal accounting of APIC enabled
> >>>> state (VLAPIC_SW_DISABLED) work?
> >>>
> >>> It doesn't.
> >>>
> >>> One of many problems on the "known errors" list from an incomplete
> >>> original attempt to get acceleration working.
> >>>
> >>> There's no good reason to cache those disables in the first place (both
> >>> are trivially available from other positions in memory), and there are
> >>> correctness reasons not to.
> >>>
> >>>> And the neighboring (to where I'm adding
> >>>> the new code) pt_may_unmask_irq() call then also wouldn't occur.
> >>>>
> >>>> I'm actually pretty sure we do too much in this case - in particular none
> >>>> of the vlapic_set_reg() should be necessary. But we certainly can't get
> >>>> away with doing nothing, and hence we depend on that VM exit to actually
> >>>> occur.
> >>>
> >>> We must do exactly and only what real hardware does, so that the state
> >>> changes emulated by Xen are identical to those accelerated by microcode.
> >>>
> >>> Other than that, I really wouldn't make any presumptions about the
> >>> existing vLAPIC logic being correct.
> >>>
> >>> It is, at best, enough of an approximation to the spec for major OSes to
> >>> function, with multiple known bugs and no coherent testing.
> >>
> >> But can we then leave resolving of the wider issue separate, and leave
> >> the change here as it presently is? Yes, mimicking the same behavior
> >> later may be "interesting", but if we can't achieve consistent behavior
> >> with yet more advanced acceleration, maybe we simply can't use that
> >> (perhaps until a complete overhaul of everything involved in LAPIC
> >> handling, possibly including a guest side indicator that they're happy
> >> without the extra signaling, at which point that yet-more-advanced
> >> acceleration could then be enabled for that guest).
> >>
> >> Otherwise - do you have any suggestion as to alternative logic which I
> >> might use in this patch?
> >
> > Maybe the underlying issue is that we allow executing
> > HVMOP_set_evtchn_upcall_vector against a remote vCPU. This could be
> > solved by only allowing HVMOP_set_evtchn_upcall_vector against the
> > current vCPU and with the LAPIC enabled, but I guess we are too late
> > for that.
>
> Allowing things like this ahead of bringing a vCPU online can be
> helpful for the OS side implementation, I think.

I think it's more natural to do the vector callback setup as part of the
CPU bringup path, like any other CPU related setting that needs to be
applied, but I guess that's a question of taste.

> > Actually, what about only injecting the spurious event if the vCPU is
> > online, ie:
> >
> > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> > index ae4368ec4b..4b84706579 100644
> > --- a/xen/arch/x86/hvm/hvm.c
> > +++ b/xen/arch/x86/hvm/hvm.c
> > @@ -4105,7 +4105,8 @@ static int hvmop_set_evtchn_upcall_vector(
> >      printk(XENLOG_G_INFO "%pv: upcall vector %02x\n", v, op.vector);
> >
> >      v->arch.hvm.evtchn_upcall_vector = op.vector;
> > -    hvm_assert_evtchn_irq(v);
> > +    if ( is_vcpu_online(v) )
> > +        hvm_assert_evtchn_irq(v);
> While this would match what hvm_set_callback_via() does, see my post-
> commit-message remark suggesting to key this to evtchn_upcall_pending.
> Tying the call to the vCPU being online looks pretty arbitrary to me,
> the more that this would continue to be
> - racy with the vCPU coming online, and perhaps already being past
> VCPUOP_initialise - after all VCPUOP_up is little more than clearing
> VPF_down,

If the OS attempts to set up the callback at the same time the CPU is
being brought online, all bets are off I think, and the OS gets to keep
the pieces.

> - problematic wrt evtchn_upcall_pending, once set, preventing event
> injection later on.
> As you may have inferred already, I'm inclined to suggest to drop the
> is_vcpu_online() check from hvm_set_callback_via().
>
> One related question here is whether vlapic_do_init() shouldn't have
> the non-architectural side effect of clearing evtchn_upcall_pending.
> While this again violates the principle of the hypervisor only ever
> setting that bit, it would deal with the risk of no further event
> injection once the flag is set, considering that vlapic_do_init()
> clears IRR (and ISR).

That would seem sensible to me, and was kind of what I was suggesting
in:

https://lore.kernel.org/xen-devel/Y3eO0bMKRPYJc2yQ@Air-de-Roger/

Thanks, Roger.
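
For reference, a minimal, untested sketch of the two alternatives Jan raises
above: keying the spurious injection to evtchn_upcall_pending rather than to
the vCPU's online state, and having vlapic_do_init() clear the flag as a
non-architectural side effect. The names used (vcpu_info(),
hvm_assert_evtchn_irq(), vlapic_vcpu(), and the hvmop_set_evtchn_upcall_vector()
context) exist in the Xen tree at the time of this thread, but the exact checks
and placement here are an interpretation of the suggestion, not the committed
change:

    /*
     * In hvmop_set_evtchn_upcall_vector(): rather than gating the spurious
     * injection on the vCPU being online, key it to evtchn_upcall_pending,
     * i.e. only kick the vCPU when an upcall is already outstanding and
     * would otherwise remain undelivered (the event channel code only
     * asserts the IRQ on a 0 -> 1 transition of that flag).
     */
    v->arch.hvm.evtchn_upcall_vector = op.vector;

    if ( vcpu_info(v, evtchn_upcall_pending) )
        hvm_assert_evtchn_irq(v);

    /*
     * In vlapic_do_init(): clear the pending-upcall flag alongside IRR/ISR,
     * as a non-architectural side effect, so that a later event can be
     * injected afresh even though the hypervisor normally only ever sets
     * this bit.
     */
    vcpu_info(vlapic_vcpu(vlapic), evtchn_upcall_pending) = 0;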