
Re: [Xen-devel] [PATCH v7 15/17] vmx: VT-d posted-interrupt core logic handling




> -----Original Message-----
> From: George Dunlap [mailto:george.dunlap@xxxxxxxxxx]
> Sent: Wednesday, September 23, 2015 11:26 PM
> To: Wu, Feng; Jan Beulich
> Cc: Andrew Cooper; Dario Faggioli; George Dunlap; Tian, Kevin;
> xen-devel@xxxxxxxxxxxxx; Keir Fraser
> Subject: Re: [Xen-devel] [PATCH v7 15/17] vmx: VT-d posted-interrupt core 
> logic
> handling
> 
> On 09/23/2015 01:35 PM, Wu, Feng wrote:
> >
> >
> >> -----Original Message-----
> >> From: George Dunlap [mailto:george.dunlap@xxxxxxxxxx]
> >> Sent: Wednesday, September 23, 2015 5:44 PM
> >> To: Jan Beulich; Wu, Feng
> >> Cc: Andrew Cooper; Dario Faggioli; George Dunlap; Tian, Kevin;
> >> xen-devel@xxxxxxxxxxxxx; Keir Fraser
> >> Subject: Re: [Xen-devel] [PATCH v7 15/17] vmx: VT-d posted-interrupt core
> logic
> >> handling
> >>
> >> On 09/22/2015 03:01 PM, Jan Beulich wrote:
> >>>>>> On 22.09.15 at 15:40, <feng.wu@xxxxxxxxx> wrote:
> >>>
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >>>>> Sent: Tuesday, September 22, 2015 5:00 PM
> >>>>> To: Wu, Feng
> >>>>> Cc: Andrew Cooper; Dario Faggioli; George Dunlap; George Dunlap;
> Tian,
> >>>> Kevin;
> >>>>> xen-devel@xxxxxxxxxxxxx; Keir Fraser
> >>>>> Subject: RE: [Xen-devel] [PATCH v7 15/17] vmx: VT-d posted-interrupt
> core
> >> logic
> >>>>> handling
> >>>>>
> >>>>>>>> On 22.09.15 at 09:19, <feng.wu@xxxxxxxxx> wrote:
> >>>>>> However, I do find some issues with my proposal above, see below:
> >>>>>>
> >>>>>> 1. Set _VPF_blocked
> >>>>>> 2. ret = arch_block()
> >>>>>> 3. if ( ret || local_events_need_delivery() )
> >>>>>>        clear_bit(_VPF_blocked, &v->pause_flags);
> >>>>>>
> >>>>>> After step #2, if ret == false, that means we successfully changed
> >>>>>> the PI descriptor in arch_block(). If local_events_need_delivery()
> >>>>>> then returns true, _VPF_blocked is cleared. After that, an external
> >>>>>> interrupt comes in, hence pi_wakeup_interrupt() --> vcpu_unblock(),
> >>>>>> but _VPF_blocked is already cleared, so vcpu_unblock() does nothing,
> >>>>>> and the vCPU's PI state is left incorrect.
> >>>>>>
> >>>>>> One possible solution for it is:
> >>>>>> - Disable the interrupts before the check in step #3 above
> >>>>>> - if local_events_need_delivery() returns true, undo all the operations
> >>>>>>  done in arch_block()
> >>>>>> - Enable interrupts after _VPF_blocked gets cleared.
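
The adjusted sequence can be modeled in self-contained C to show the race-free ordering (all of the names below are stand-ins for the real Xen hooks, not their actual signatures):

```c
#include <stdbool.h>
#include <assert.h>

/* Models of the relevant state; illustrative only. */
static bool vpf_blocked;        /* stands in for _VPF_blocked            */
static bool pi_desc_blocked;    /* PI descriptor configured for blocking */
static bool event_pending;      /* drives local_events_need_delivery()   */

static void local_irq_disable(void) { /* stub for the sketch */ }
static void local_irq_enable(void)  { /* stub for the sketch */ }

/* Returns false once the PI descriptor (NV/NDST) has been updated
 * for the blocked state, true on failure. */
static bool arch_block(void)
{
    pi_desc_blocked = true;
    return false;
}

static void undo_arch_block(void)
{
    pi_desc_blocked = false;
}

static bool local_events_need_delivery(void)
{
    return event_pending;
}

static void do_block(void)
{
    vpf_blocked = true;                 /* 1. set _VPF_blocked         */
    bool ret = arch_block();            /* 2. reprogram PI descriptor  */

    local_irq_disable();                /* close the race window       */
    if ( ret || local_events_need_delivery() )
    {
        if ( !ret )
            undo_arch_block();          /* roll back the PI changes    */
        vpf_blocked = false;
    }
    local_irq_enable();                 /* only after the flag settles */
}
```

With interrupts disabled around the final check, pi_wakeup_interrupt() can no longer slip in between clearing _VPF_blocked and undoing the PI descriptor changes.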
> >>>>>
> >>>>> Couldn't this be taken care of by, if necessary, adjusting PI state
> >>>>> in vmx_do_resume()?
> >>>>
> >>>> What do you mean here? Could you please elaborate? Thanks!
> >>>
> >>> From the discussion so far I understand that all you're after is that
> >>> the PI descriptor is correct for the period of time while the guest
> >>> vCPU is actually executing in guest context. If that's a correct
> >>> understanding of mine, then setting the vector and flags back to
> >>> what's needed while the guest is running would be sufficient to be
> >>> done right before entering the guest, i.e. in vmx_do_resume().
> >>
> >> Along those lines, is setting the SN and NDST relatively expensive?
> >>
> >> If they are, then of course switching them in __context_switch() makes
> >> sense.
> >>
> >> But if setting them is fairly cheap, then we could just clear SN on
> >> every vmexit, and set SN and NDST on every vmentry, couldn't we?
> >
> > Do you mean _set_ SN (Suppress Notification) on vmexit and _clear_
> > SN on vmentry? I think this might be an alternative.
> 
> Er, yes, that's what I meant. :-)  Sorry, getting the set / clear
> "suppress notification" mixed up with the normal sti / cli (set / clear
> interrupts).
> 
> >
> >> Then we wouldn't need hooks on the context switch path at all (which are
> only
> >> there to catch running -> runnable and runnable -> running) -- we could
> >> just have the block/unblock hooks (to change NV and add / remove things
> >> from the list).
> >
> > We still need to _clear_ SN when vCPU is being blocked.
> 
> Yes, thanks for the reminder.
> 
> >> Setting NDST on vmentry also has the advantage of being more robust --
> >> you don't have to carefully think about cpu migration and all that; you
> >> simply set it right when you're about to actually use it.
> >
> > In the current solution, we set the NDST in the end of vmx_ctxt_switch_to(),
> > it also doesn't suffer from cpu migration, right?
> 
> It works, yes.
> 
> >> Looking at your comment in pi_notification_interrupt() -- I take it that
> >> "pending interrupt in PIRR will be synced to vIRR" somewhere in the call
> >> to vmx_intr_assist()?
> >
> > Yes.
> > vmx_intr_assist() --> hvm_vcpu_has_pending_irq()
> > --> vlapic_has_pending_irq() --> vlapic_find_highest_irr()
> > --> hvm_funcs.sync_pir_to_irr()
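
The sync at the end of that chain boils down to atomically draining the posted-interrupt request bits into the virtual IRR. A rough, self-contained model (layout simplified to eight 32-bit words; the function name is illustrative, not the real Xen one):

```c
#include <stdint.h>

/* Drain pending posted-interrupt requests (PIR) into the virtual IRR.
 * The atomic exchange takes each word's pending bits in one shot, so
 * interrupts posted concurrently are not lost -- they either land in
 * this sync or remain in the PIR for the next one. */
static void sync_pir_to_irr_model(uint32_t pir[8], uint32_t irr[8])
{
    for ( int i = 0; i < 8; i++ )
    {
        uint32_t pending = __atomic_exchange_n(&pir[i], 0,
                                               __ATOMIC_SEQ_CST);
        irr[i] |= pending;
    }
}
```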
> >
> >> So if we were to set NDST and enable SN before
> >> that call, we should be able to set VCPU_KICK if we get an interrupt
> >> between vmx_intr_assist() and the actual vmentry as necessary.
> >
> > But if the interrupt comes in between the last vmexit and enabling SN
> > here, it cannot be injected into the guest during this vmentry. This
> > will affect interrupt latency: each PI that happens while the vCPU is
> > in root mode needs to be injected into the guest during the next
> > vmentry.
> 
> I don't understand this.  I'm proposing this:
> 
> (actual vmexit)
> ...[1]
> set SN
> ...[2]
> (Hypervisor does stuff)
> vmx_do_vmentry
> set NDST
> clear SN
> ...[3]
> vmx_intr_assist()
> ...[4]
> cli
> (check for softirqs)
> ...[5]
> (actual vmentry)
> 
> Here is what happens if an interrupt is generated at the various [N]:
> 
> [1]: posted_interrupt_vector delivered to hypervisor.  VCPU_KICK set,
> but unnecessarily, since the pending interrupt will already be picked up
> by vmx_intr_assist().
> 
> [2]: No interrupt delivered to hypervisor.  Pending interrupt will be
> picked up by vmx_intr_assist().
> 
> [3]: posted_interrupt_vector delivered to hypervisor.  VCPU_KICK set,
> but unnecessarily, since pending interrupt will already be picked up by
> vmx_intr_assist().
> 
> [4]: posted_interrupt_vector delivered to hypervisor.  VCPU_KICK set
> necessarily this time.  check for softirqs causes it to retry vmentry,
> so vmx_intr_assist() can pick up the pending interrupt.
> 
> [5]: interrupt not delivered until interrupts are clear in the guest
> context, at which point it will be delivered directly to the guest.
> 
> This seems to me to have the same properties the solution you propose,
> except that in your solution, [2] behaves identically to [1] and [3].
> Have I missed something?

No, I think you covered everything! :) Nice analysis!!
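
For reference, the windows George enumerates map onto the vmexit-to-vmentry path roughly as follows (pseudo-C only; every name is a stand-in for the real path):

```c
/* (actual vmexit) */
pi_set_sn(v);                  /* [1] before, [2] after: notifications
                                * suppressed while the hypervisor works */
/* hypervisor does stuff */
retry:
pi_set_ndst(v, this_cpu());    /* point notifications at current pcpu  */
pi_clear_sn(v);                /* [3]: posted vector reaches hypervisor
                                * again; VCPU_KICK set if it fires      */
vmx_intr_assist(v);            /* syncs PIRR into vIRR                  */
local_irq_disable();
if ( softirqs_pending() )      /* [4]: VCPU_KICK raised a softirq, so   */
{                              /* retry so vmx_intr_assist() picks up   */
    local_irq_enable();        /* the newly posted interrupt            */
    goto retry;
}
/* [5]: (actual vmentry) -- further posts delivered directly to guest  */
```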

> 
> > I think the current solution we have discussed these days is very clear,
> > and I am about to implement it. Does the above method have an obvious
> > advantage compared to what we have discussed so far?
> 
> This seems to me to be a cleaner design.  It removes the need to modify
> any code on context-switch path.  It moves modification of NDST and most
> modifications of SN closer to where I think they logically go.  It
> reduces the number of unnecessary PI interrupts delivered to the
> hypervisor (by suppressing notifications for most of the time spent in
> the hypervisor), and automatically deals with the "spurious interrupts
> during tasklet execution / vcpu offline lazy context switch" issue we
> were talking about in the other thread.

One issue is that the number of vmexits is far larger than the number of
context switches. I tested it for a fairly short time, and it showed
2910043 vmexits versus 71733 context switches (counting only
__context_switch(), since we only change the PI state in that function).
If we change the PI state on every vmexit/vmentry, I am afraid this will
hurt performance.
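
For scale, each SN toggle being weighed here is a single atomic read-modify-write on the PI descriptor's control word. A minimal model (ON = bit 0, SN = bit 1 in this model; the struct and helper names are illustrative, not the real Xen definitions):

```c
#include <stdint.h>

/* Minimal model of the PI descriptor control word.  Bit positions
 * follow the ON/SN layout discussed in the thread; names are
 * illustrative, not the real Xen definitions. */

#define PI_ON_BIT 0   /* Outstanding Notification */
#define PI_SN_BIT 1   /* Suppress Notification    */

struct pi_desc_model {
    uint64_t control;
};

/* Each toggle is one atomic RMW on a single (normally cache-hot) word. */
static void pi_set_sn(struct pi_desc_model *d)
{
    __atomic_fetch_or(&d->control, UINT64_C(1) << PI_SN_BIT,
                      __ATOMIC_SEQ_CST);
}

static void pi_clear_sn(struct pi_desc_model *d)
{
    __atomic_fetch_and(&d->control, ~(UINT64_C(1) << PI_SN_BIT),
                       __ATOMIC_SEQ_CST);
}
```

The operation itself is a single locked instruction, though at roughly 40x the frequency of context switches it may still be measurable.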

Thanks,
Feng

> 
> On the other hand, it does add extra hooks on the vmentry/vmexit paths;
> but they seem very similar to me to the kind of hooks which are already
> there.
> 
> So, at the moment my *advice* is to look into setting SN / NDST on
> vmexit/vmentry, and only having hooks at block/unblock.
> 
> However, the __context_switch() path is also OK with me, if Jan and
> Dario are happy.
> 
> Jan / Dario, do you guys have any opinions / preferences?
> 
>  -George


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

