[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 2/3] x86/nvmx: clarify and fix usage of nvmx_update_apicv



On Tue, Mar 24, 2020 at 10:11:15AM +0000, Tian, Kevin wrote:
> > From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> > Sent: Tuesday, March 24, 2020 5:51 PM
> > 
> > On Tue, Mar 24, 2020 at 06:03:28AM +0000, Tian, Kevin wrote:
> > > > From: Roger Pau Monne <roger.pau@xxxxxxxxxx>
> > > > Sent: Saturday, March 21, 2020 3:08 AM
> > > >
> > > > The current usage of nvmx_update_apicv is not clear: it is deeply
> > > > intertwined with the Ack interrupt on exit VMEXIT control.
> > > >
> > > > The code in nvmx_update_apicv should update the SVI (in service
> > interrupt)
> > > > field of the guest interrupt status only when the Ack interrupt on
> > > > exit is set, as it is used to record that the interrupt being
> > > > serviced is signaled in a vmcs field, and hence hasn't been injected
> > > > as on native. It's important to record the current in service
> > > > interrupt on the guest interrupt status field, or else further
> > > > interrupts won't respect the priority of the in service one.
> > > >
> > > > While clarifying the usage make sure that the SVI is only updated when
> > > > Ack on exit is set and the nested vmcs interrupt info field is valid. Or
> > > > else a guest not using the Ack on exit feature would loose interrupts as
> > > > they would be signaled as being in service on the guest interrupt
> > > > status field but won't actually be recorded on the interrupt info vmcs
> > > > field, neither injected in any other way.
> > >
> > > It is insufficient. You also need to update RVI to enable virtual 
> > > injection
> > > when Ack on exit is cleared.
> > 
> > But RVI should be updated in vmx_intr_assist in that case, since
> > nvmx_intr_intercept shouldn't intercept the interrupt, as it should be
> > handled normally.
> 
> As we discussed before, vmx_intr_assist is invoked before nvmx_switch_guest.
> 
> It is incorrectly to update RVI at that time since it might be still vmcs02 
> being 
> active (if no pending softirq to make it invoked again).
> 
> Also nvmx_intr_intercept does intercept Ack-on-exit=0 case:
> 
>         if ( intack.source == hvm_intsrc_pic ||
>                  intack.source == hvm_intsrc_lapic )
>         {
>             vmx_inject_extint(intack.vector, intack.source);
> 
>             ctrl = get_vvmcs(v, VM_EXIT_CONTROLS);
>             if ( ctrl & VM_EXIT_ACK_INTR_ON_EXIT )
>             {
>                 /* for now, duplicate the ack path in vmx_intr_assist */
>                 hvm_vcpu_ack_pending_irq(v, intack);
>                 pt_intr_post(v, intack);
> 
>                 intack = hvm_vcpu_has_pending_irq(v);
>                 if ( unlikely(intack.source != hvm_intsrc_none) )
>                     vmx_enable_intr_window(v, intack);
>             }
>             else if ( !cpu_has_vmx_virtual_intr_delivery )
>                 vmx_enable_intr_window(v, intack);
> 
>             return 1; <<<<<<<<

Right, I always get confused by the switch happening in the vmentry
path.

That only happens when the vcpu is in nested mode
(nestedhvm_vcpu_in_guestmode == true). That would be the case before a
vmexit, at least for the first call to vmx_intr_assist if there are
pending softirqs.

Note that if there are pending softirqs the second time
nvmx_intr_intercept will return 0.

>         }
> 
> > 
> > > >
> > > > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> > > > ---
> > > >  xen/arch/x86/hvm/vmx/vvmx.c | 11 ++++++++++-
> > > >  1 file changed, 10 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/xen/arch/x86/hvm/vmx/vvmx.c
> > b/xen/arch/x86/hvm/vmx/vvmx.c
> > > > index 1b8461ba30..180d01e385 100644
> > > > --- a/xen/arch/x86/hvm/vmx/vvmx.c
> > > > +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> > > > @@ -1383,7 +1383,7 @@ static void (struct vcpu *v)
> > > >  {
> > > >      struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
> > > >      unsigned long reason = get_vvmcs(v, VM_EXIT_REASON);
> > > > -    uint32_t intr_info = nvmx->intr.intr_info;
> > > > +    unsigned long intr_info = get_vvmcs(v, VM_EXIT_INTR_INFO);
> > >
> > > well, above reminds me an interesting question. Combining last and this
> > > patch, we'd see essentially that it goes back to the state before f96e1469
> > > (at least when Ack on exit is true).
> > 
> > Well, patch 1/3 is a revert of f96e1469, so just reverting f96e1469
> > gets us to that state.
> 
> you are right. I just wanted to point out that this patch alone doesn't
> do any real fix for ack-on-exit=1 case. It just makes sure that ack-on-exit=0
> is skipped from that function. So it won't be the one fixing your previous
> problem. 😊

Yes, that's correct.

> > 
> > This patch is an attempt to clarify that nvmx_update_apicv is closely
> > related to the Ack on exit feature, as it modifies SVI in order to
> > signal the vector currently being serviced by the EXIT_INTR_INFO vmcs
> > field. This was not obvious to me, as at first sight I assumed
> > nvmx_update_apicv was actually injecting that vector into the guest.
> > 
> > > iirc, that commit was introduced to enable
> > > nested x2apic with apicv, and your very first version even just removed
> > > the whole nvmx_update_apicv. Then now with the new reverted logic,
> > > are you still suffering x2apic problem? If not, does it imply the real fix
> > > is actually coming from patch 3/3 for eoi bitmap update?
> > 
> > Yes, patch 3/3 is the one that fixed the issue. Note however that
> > strangely enough removing the call to nvmx_update_apicv (as my first
> > attempt to solve the issue) did fix it on one of my boxes. It depends
> > a lot on the available vmx features AFAICT.
> 
> Did you confirm that with 3/3 alone can fix that issue? Just want to make
> sure the real gain of each patch, so we can reflect it in the commit msg
> in updated version.

Yes, the patch that actually fixes the issue in the box I've been
testing with is 3/3.

Xen will always use the Ack on exit feature, I currently have no way
to test whether not using Ack on exit works at all.

> > 
> > > >
> > > >      if ( reason == EXIT_REASON_EXTERNAL_INTERRUPT &&
> > > >           nvmx->intr.source == hvm_intsrc_lapic &&
> > > > @@ -1399,6 +1399,15 @@ static void nvmx_update_apicv(struct vcpu *v)
> > > >          ppr = vlapic_set_ppr(vlapic);
> > > >          WARN_ON((ppr & 0xf0) != (vector & 0xf0));
> > > >
> > > > +        /*
> > > > +         * SVI must be updated when the interrupt has been signaled 
> > > > using
> > the
> > > > +         * Ack on exit feature, or else the currently in-service 
> > > > interrupt
> > > > +         * won't be respected.
> > > > +         *
> > > > +         * Note that this is specific to the fact that when doing a 
> > > > VMEXIT an
> > > > +         * interrupt might get delivered using the interrupt info vmcs 
> > > > field
> > > > +         * instead of being injected normally.
> > > > +         */
> > >
> > > I'm not sure mentioning SVI specifically here is necessary, since all 
> > > steps
> > > here are required - updating iir, isr, rvi, svi, ppr, etc. It is just an 
> > > emulation
> > > of updating virtual APIC state as if a virtual interrupt delivery happens.
> > 
> > Hm, it was hard for me to figure out that SVI is modified here in
> > order to signal that the Ack on exit vector is currently in service,
> > as opposed to being injected using the virtual interrupt delivery
> > mechanism.
> > 
> > I just wanted to clarify that the purpose of this function is not to
> > inject the vector in intr_info, but rather to signal that such vector
> > has already been injected using a different mechanism.
> > 
> > I'm happy to reword this, but IMO it should be clear that the purpose
> > of the function is not so much to inject an interrupt, but to update
> > the virtual interrupt delivery field in order to signal that an
> > interrupt has been injected using a different mechanism.
> > 
> 
> reading it again I feel possibly fine to put the comment there. But
> I disagree the statement above. The purpose of this function is indeed 
> for injecting an interrupt. Both RVI and SVI are additional requirements 
> when virtual APIC page is in-use while virtual interrupt delivery is not 
> used, i.e. when ack-on-exit is set.

Right, so if I understand this correctly:

 - SVI: must be updated when Ack on exit is used, to signal the
   current in service interrupt which has been injected using a
   mechanism different than virtual interrupts.

 - RVI: should always be updated if there are pending interrupts to
   be delivered. Note that the interrupt signaled in the INTR_INFO
   field if Ack on exit is enabled is not considered pending anymore.

So I think RVI should always be updated, regardless of whether Ack on
exit is used or not. Do you agree?

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.