
Re: [Xen-devel] [PATCH v7 15/17] vmx: VT-d posted-interrupt core logic handling




> -----Original Message-----
> From: George Dunlap [mailto:george.dunlap@xxxxxxxxxx]
> Sent: Wednesday, September 23, 2015 11:26 PM
> To: Wu, Feng; Jan Beulich
> Cc: Andrew Cooper; Dario Faggioli; George Dunlap; Tian, Kevin;
> xen-devel@xxxxxxxxxxxxx; Keir Fraser
> Subject: Re: [Xen-devel] [PATCH v7 15/17] vmx: VT-d posted-interrupt core 
> logic
> handling
> 
> On 09/23/2015 01:35 PM, Wu, Feng wrote:
> >
> >
> >> -----Original Message-----
> >> From: George Dunlap [mailto:george.dunlap@xxxxxxxxxx]
> >> Sent: Wednesday, September 23, 2015 5:44 PM
> >> To: Jan Beulich; Wu, Feng
> >> Cc: Andrew Cooper; Dario Faggioli; George Dunlap; Tian, Kevin;
> >> xen-devel@xxxxxxxxxxxxx; Keir Fraser
> >> Subject: Re: [Xen-devel] [PATCH v7 15/17] vmx: VT-d posted-interrupt core
> logic
> >> handling
> >>
> >> On 09/22/2015 03:01 PM, Jan Beulich wrote:
> >>>>>> On 22.09.15 at 15:40, <feng.wu@xxxxxxxxx> wrote:
> >>>
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >>>>> Sent: Tuesday, September 22, 2015 5:00 PM
> >>>>> To: Wu, Feng
> >>>>> Cc: Andrew Cooper; Dario Faggioli; George Dunlap; George Dunlap;
> Tian,
> >>>> Kevin;
> >>>>> xen-devel@xxxxxxxxxxxxx; Keir Fraser
> >>>>> Subject: RE: [Xen-devel] [PATCH v7 15/17] vmx: VT-d posted-interrupt
> core
> >> logic
> >>>>> handling
> >>>>>
> >>>>>>>> On 22.09.15 at 09:19, <feng.wu@xxxxxxxxx> wrote:
> >>>>>> However, I do find some issues with my proposal above, see below:
> >>>>>>
> >>>>>> 1. Set _VPF_blocked
> >>>>>> 2. ret = arch_block()
> >>>>>> 3. if ( ret || local_events_need_delivery() )
> >>>>>>        clear_bit(_VPF_blocked, &v->pause_flags);
> >>>>>>
> >>>>>> After step #2, if ret == false, that means we successfully changed
> >>>>>> the PI descriptor in arch_block(). If local_events_need_delivery()
> >>>>>> then returns true, _VPF_blocked is cleared. After that, an external
> >>>>>> interrupt comes in, hence pi_wakeup_interrupt() --> vcpu_unblock(),
> >>>>>> but _VPF_blocked is already cleared, so vcpu_unblock() does nothing,
> >>>>>> and the vCPU's PI state is left incorrect.
> >>>>>>
> >>>>>> One possible solution for it is:
> >>>>>> - Disable the interrupts before the check in step #3 above
> >>>>>> - if local_events_need_delivery() returns true, undo all the operations
> >>>>>>  done in arch_block()
> >>>>>> - Enable interrupts after _VPF_blocked gets cleared.
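
The adjusted sequence can be modeled in self-contained C to show the race-free ordering (all of the names below are stand-ins for the real Xen hooks, not their actual signatures):

```c
#include <stdbool.h>
#include <assert.h>

/* Models of the relevant state; illustrative only. */
static bool vpf_blocked;        /* stands in for _VPF_blocked            */
static bool pi_desc_blocked;    /* PI descriptor configured for blocking */
static bool event_pending;      /* drives local_events_need_delivery()   */

static void local_irq_disable(void) { /* stub for the sketch */ }
static void local_irq_enable(void)  { /* stub for the sketch */ }

/* Returns false once the PI descriptor (NV/NDST) has been updated
 * for the blocked state, true on failure. */
static bool arch_block(void)
{
    pi_desc_blocked = true;
    return false;
}

static void undo_arch_block(void)
{
    pi_desc_blocked = false;
}

static bool local_events_need_delivery(void)
{
    return event_pending;
}

static void do_block(void)
{
    vpf_blocked = true;                 /* 1. set _VPF_blocked         */
    bool ret = arch_block();            /* 2. reprogram PI descriptor  */

    local_irq_disable();                /* close the race window       */
    if ( ret || local_events_need_delivery() )
    {
        if ( !ret )
            undo_arch_block();          /* roll back the PI changes    */
        vpf_blocked = false;
    }
    local_irq_enable();                 /* only after the flag settles */
}
```

With interrupts disabled around the final check, pi_wakeup_interrupt() can no longer slip in between clearing _VPF_blocked and undoing the PI descriptor changes.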
> >>>>>
> >>>>> Couldn't this be taken care of by, if necessary, adjusting PI state
> >>>>> in vmx_do_resume()?
> >>>>
> >>>> What do you mean here? Could you please elaborate? Thanks!
> >>>
> >>> From the discussion so far I understand that all you're after is that
> >>> the PI descriptor is correct for the period of time while the guest
> >>> vCPU is actually executing in guest context. If that's a correct
> >>> understanding of mine, then setting the vector and flags back to
> >>> what's needed while the guest is running would be sufficient to be
> >>> done right before entering the guest, i.e. in vmx_do_resume().
> >>
> >> Along those lines, is setting the SN and NDST relatively expensive?
> >>
> >> If they are, then of course switching them in __context_switch() makes
> >> sense.
> >>
> >> But if setting them is fairly cheap, then we could just clear SN on
> >> every vmexit, and set SN and NDST on every vmentry, couldn't we?
> >
> > Do you mean _set_ SN (Suppress Notification) on vmexit and _clear_
> > SN on vmentry? I think this might be an alternative.
> 
> Er, yes, that's what I meant. :-)  Sorry, getting the set / clear
> "suppress notification" mixed up with the normal sti / cli (set / clear
> interrupts).
> 
> >
> >> Then we wouldn't need hooks on the context switch path at all (which are
> only
> >> there to catch running -> runnable and runnable -> running) -- we could
> >> just have the block/unblock hooks (to change NV and add / remove things
> >> from the list).
> >
> > We still need to _clear_ SN when vCPU is being blocked.
> 
> Yes, thanks for the reminder.
> 
> >> Setting NDST on vmentry also has the advantage of being more robust --
> >> you don't have to carefully think about cpu migration and all that; you
> >> simply set it right when you're about to actually use it.
> >
> > In the current solution, we set the NDST in the end of vmx_ctxt_switch_to(),
> > it also doesn't suffer from cpu migration, right?
> 
> It works, yes.
> 
> >> Looking at your comment in pi_notification_interrupt() -- I take it that
> >> "pending interrupt in PIRR will be synced to vIRR" somewhere in the call
> >> to vmx_intr_assist()?
> >
> > Yes.
> > vmx_intr_assist() --> hvm_vcpu_has_pending_irq()
> > --> vlapic_has_pending_irq() --> vlapic_find_highest_irr()
> > --> hvm_funcs.sync_pir_to_irr()
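
The sync at the end of that chain boils down to atomically draining the posted-interrupt request bits into the virtual IRR. A rough, self-contained model (layout simplified to eight 32-bit words; the function name is illustrative, not the real Xen one):

```c
#include <stdint.h>

/* Drain pending posted-interrupt requests (PIR) into the virtual IRR.
 * The atomic exchange takes each word's pending bits in one shot, so
 * interrupts posted concurrently are not lost -- they either land in
 * this sync or remain in the PIR for the next one. */
static void sync_pir_to_irr_model(uint32_t pir[8], uint32_t irr[8])
{
    for ( int i = 0; i < 8; i++ )
    {
        uint32_t pending = __atomic_exchange_n(&pir[i], 0,
                                               __ATOMIC_SEQ_CST);
        irr[i] |= pending;
    }
}
```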
> >
> >> So if we were to set NDST and enable SN before
> >> that call, we should be able to set VCPU_KICK if we get an interrupt
> >> between vmx_intr_assist() and the actual vmentry as necessary.
> >
> > But if the interrupt comes in between the last vmexit and enabling SN
> > here, it cannot be injected into the guest during this vmentry. This
> > will affect interrupt latency: each PI that happens while the vCPU is
> > in root mode needs to be injected into the guest during the next
> > vmentry.
> 
> I don't understand this.  I'm proposing this:
> 
> (actual vmexit)
> ...[1]
> set SN
> ...[2]
> (Hypervisor does stuff)
> vmx_do_vmentry
> set NDST
> clear SN
> ...[3]
> vmx_intr_assist()
> ...[4]
> cli
> (check for softirqs)
> ...[5]
> (actual vmentry)
> 
> Here is what happens if an interrupt is generated at the various [N]:
> 
> [1]: posted_interrupt_vector delivered to hypervisor.  VCPU_KICK set,
> but unnecessarily, since the pending interrupt will already be picked up
> by vmx_intr_assist().
> 
> [2]: No interrupt delivered to hypervisor.  Pending interrupt will be
> picked up by vmx_intr_assist().
> 
> [3]: posted_interrupt_vector delivered to hypervisor.  VCPU_KICK set,
> but unnecessarily, since pending interrupt will already be picked up by
> vmx_intr_assist().
> 
> [4]: posted_interrupt_vector delivered to hypervisor.  VCPU_KICK set
> necessarily this time.  check for softirqs causes it to retry vmentry,
> so vmx_intr_assist() can pick up the pending interrupt.
> 
> [5]: interrupt not delivered until interrupts are clear in the guest
> context, at which point it will be delivered directly to the guest.
> 
> This seems to me to have the same properties the solution you propose,
> except that in your solution, [2] behaves identically to [1] and [3].
> Have I missed something?

No, I think you covered everything! :) Nice analysis!!
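
For reference, the windows George enumerates map onto the vmexit-to-vmentry path roughly as follows (pseudo-C only; every name is a stand-in for the real path):

```c
/* (actual vmexit) */
pi_set_sn(v);                  /* [1] before, [2] after: notifications
                                * suppressed while the hypervisor works */
/* hypervisor does stuff */
retry:
pi_set_ndst(v, this_cpu());    /* point notifications at current pcpu  */
pi_clear_sn(v);                /* [3]: posted vector reaches hypervisor
                                * again; VCPU_KICK set if it fires      */
vmx_intr_assist(v);            /* syncs PIRR into vIRR                  */
local_irq_disable();
if ( softirqs_pending() )      /* [4]: VCPU_KICK raised a softirq, so   */
{                              /* retry so vmx_intr_assist() picks up   */
    local_irq_enable();        /* the newly posted interrupt            */
    goto retry;
}
/* [5]: (actual vmentry) -- further posts delivered directly to guest  */
```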

> 
> > I think the current solution we have discussed these days is very clear,
> > and I am about to implement it. Does the above method have an obvious
> > advantage compared to what we have discussed so far?
> 
> This seems to me to be a cleaner design.  It removes the need to modify
> any code on context-switch path.  It moves modification of NDST and most
> modifications of SN closer to where I think they logically go.  It
> reduces the number of unnecessary PI interrupts delivered to the
> hypervisor (by suppressing notifications for most of the time spent in
> the hypervisor), and automatically deals with the "spurious interrupts
> during tasklet execution / vcpu offline lazy context switch" issue we
> were talking about in the other thread.

One issue is that the number of vmexits is far larger than the number of
context switches. I tested it for a fairly short time, and it showed
2910043 vmexits versus 71733 context switches (counting only
__context_switch(), since we only change the PI state in that function).
If we change the PI state on every vmexit/vmentry, I am afraid this will
hurt performance.
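
For scale, each SN toggle being weighed here is a single atomic read-modify-write on the PI descriptor's control word. A minimal model (ON = bit 0, SN = bit 1 in this model; the struct and helper names are illustrative, not the real Xen definitions):

```c
#include <stdint.h>

/* Minimal model of the PI descriptor control word.  Bit positions
 * follow the ON/SN layout discussed in the thread; names are
 * illustrative, not the real Xen definitions. */

#define PI_ON_BIT 0   /* Outstanding Notification */
#define PI_SN_BIT 1   /* Suppress Notification    */

struct pi_desc_model {
    uint64_t control;
};

/* Each toggle is one atomic RMW on a single (normally cache-hot) word. */
static void pi_set_sn(struct pi_desc_model *d)
{
    __atomic_fetch_or(&d->control, UINT64_C(1) << PI_SN_BIT,
                      __ATOMIC_SEQ_CST);
}

static void pi_clear_sn(struct pi_desc_model *d)
{
    __atomic_fetch_and(&d->control, ~(UINT64_C(1) << PI_SN_BIT),
                       __ATOMIC_SEQ_CST);
}
```

The operation itself is a single locked instruction, though at roughly 40x the frequency of context switches it may still be measurable.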

Thanks,
Feng

> 
> On the other hand, it does add extra hooks on the vmentry/vmexit paths;
> but they seem very similar to me to the kind of hooks which are already
> there.
> 
> So, at the moment my *advice* is to look into setting SN / NDST on
> vmexit/vmentry, and only having hooks at block/unblock.
> 
> However, the __context_switch() path is also OK with me, if Jan and
> Dario are happy.
> 
> Jan / Dario, do you guys have any opinions / preferences?
> 
>  -George


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

