
Re: [Xen-devel] [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling



On 28/01/16 05:12, Feng Wu wrote:
> This is the core logic handling for VT-d posted-interrupts. Basically it
> deals with how and when to update posted-interrupts during the following
> scenarios:
> - vCPU is preempted
> - vCPU is slept
> - vCPU is blocked
> 
> When vCPU is preempted/slept, we update the posted-interrupts during
> scheduling by introducing two new architectural scheduler hooks:
> vmx_pi_switch_from() and vmx_pi_switch_to(). When vCPU is blocked, we
> introduce a new architectural hook: arch_vcpu_block() to update
> posted-interrupts descriptor.
> 
> Besides that, before VM-entry, we will make sure the 'NV' field is set
> to 'posted_intr_vector' and the vCPU is not in any blocking lists, which
> is needed when vCPU is running in non-root mode. The reason we do this check
> is because we change the posted-interrupts descriptor in vcpu_block(),
> however, we don't change it back in vcpu_unblock() or when vcpu_block()
> directly returns due to event delivery (in fact, we don't need to do it
> in the two places, that is why we do it before VM-Entry).
> 
> When we handle the lazy context switch for the following two scenarios:
> - Preempted by a tasklet, which runs in an idle context.
> - The prev vcpu is offline and there are no new available vcpus in the
>   run queue.
> We don't change the 'SN' bit in the posted-interrupt descriptor. This
> may incur spurious PI notification events, but a PI notification
> event is only sent when 'ON' is clear, and once the PI notification
> is sent, 'ON' is set by hardware, so no more notification events are
> sent until 'ON' is cleared again. Besides that, spurious PI notification
> events are going to happen from time to time in the Xen hypervisor
> anyway. For example, when a guest traps to Xen and a PI notification
> event happens, there is nothing Xen actually needs to do about it; the
> interrupts will be delivered to the guest the next time we do a VMENTRY.
> 
> CC: Keir Fraser <keir@xxxxxxx>
> CC: Jan Beulich <jbeulich@xxxxxxxx>
> CC: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> CC: Kevin Tian <kevin.tian@xxxxxxxxx>
> CC: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> CC: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
> Suggested-by: Yang Zhang <yang.z.zhang@xxxxxxxxx>
> Suggested-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
> Suggested-by: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> Suggested-by: Jan Beulich <jbeulich@xxxxxxxx>
> Signed-off-by: Feng Wu <feng.wu@xxxxxxxxx>

Feng,

Thanks for your work on this.

Coming back to this thread after 5 months, what strikes me first of all
is that it would be good to have a comment somewhere laying out exactly
all the things that need to change for the different runstates with
posted interrupts, so that someone later trying to change things has an
overview of what invariants need to be kept.

What do you think about adding the following comment somewhere near the
PI callbacks? (Corrected for accuracy of course.)

---
To handle posted interrupts correctly, we need to set the following state:

* The PI notification vector (NV)
* The PI notification destination processor (NDST)
* The PI "suppress notification" bit (SN)
* The vcpu pi "blocked" list

If a VM is currently running, we want the PI delivered to the guest vcpu
on the proper pcpu (NDST = v->processor, SN clear).

If the VM is blocked, we want the PI delivered to Xen so that it can
wake the vcpu up (SN clear, NV = pi_wakeup_vector, vcpu on block list).

If the VM is currently either preempted or offline (i.e., not running
because of some reason other than blocking waiting for an interrupt),
there's nothing Xen can do -- we want the interrupt pending bit set in
the guest, but we don't want to bother Xen with an interrupt (SN set).

There's a brief window of time between vmx_intr_assist() and checking
softirqs where if an interrupt comes in it may be lost; so we need Xen
to get an interrupt and raise a softirq so that it will go through the
vmx_intr_assist() path again (SN clear, NV = posted_intr_vector).

The way we implement this now is by looking at what needs to happen on
the following runstate transitions:

A: runnable -> running
 - SN = 0
 - NDST = v->processor
B: running -> runnable
 - SN = 1
C: running -> blocked
 - NV = pi_wakeup_vector
 - Add vcpu to blocked list
D: blocked -> runnable
 - NV = posted_intr_vector
 - Take vcpu off blocked list

For transitions A and B, we add hooks into vmx_ctxt_switch_{from,to} paths.

For transition C, we add a new arch hook, arch_vcpu_block(), which is
called from vcpu_block() and vcpu_do_poll().

For transition D, rather than add an extra arch hook on vcpu_wake, we
add a hook on the vmentry path which checks to see if either of the two
actions needs to be taken.

These hooks only need to be called when the domain in question actually
has a physical device assigned to it, so we set and clear the callbacks
as appropriate when device assignment changes.
---

Is that about right?

If we had this, I don't think we'd need the comments in
vmx_pi_switch_{from,to}.
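
To make the four transitions in the proposed comment concrete, here is a
minimal stand-alone sketch of them as a toy model. It is not the patch's
code: struct pi_model, the pi_* function names and the two vector values
are hypothetical stand-ins made up for illustration, and the real
descriptor updates would need to be atomic with respect to hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical vector numbers; the real values are chosen by Xen. */
enum { POSTED_INTR_VECTOR = 0xf2, PI_WAKEUP_VECTOR = 0xf1 };

/* Toy model of the per-vcpu state listed in the comment (assumed layout). */
struct pi_model {
    unsigned int nv;       /* notification vector (NV) */
    unsigned int ndst;     /* notification destination pcpu (NDST) */
    bool sn;               /* suppress notification (SN) */
    bool on_blocked_list;  /* vcpu is on the pi "blocked" list */
};

/* A: runnable -> running */
static void pi_switch_to(struct pi_model *pi, unsigned int processor)
{
    pi->ndst = processor;
    pi->sn = false;
}

/* B: running -> runnable */
static void pi_switch_from(struct pi_model *pi)
{
    pi->sn = true;
}

/* C: running -> blocked */
static void pi_block(struct pi_model *pi)
{
    pi->nv = PI_WAKEUP_VECTOR;
    pi->on_blocked_list = true;
}

/*
 * D: blocked -> runnable, applied lazily on the vmentry path: undo
 * whichever of the two actions from C is still in effect.
 */
static void pi_do_resume(struct pi_model *pi)
{
    if (pi->nv != POSTED_INTR_VECTOR)
        pi->nv = POSTED_INTR_VECTOR;
    if (pi->on_blocked_list)
        pi->on_blocked_list = false;
}
```

In this model, calling pi_switch_to(), then pi_block(), then
pi_do_resume() leaves the vcpu off the blocked list with NV back at
POSTED_INTR_VECTOR and SN clear, which is the state the comment asks
for while a vcpu is running.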

Laying things out this way, it also makes me wonder whether it might not
be more sensible / robust to set NDST on the vmentry path in the same
way we set NV.  But at this point it's just bikeshedding, so feel free
to leave it where it is.

Other than that -- and the details about placement of the ASSERT and the
hook reassignment -- it all looks good to me.

 -George

