Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: Wednesday, March 04, 2015 11:19 PM
> To: Wu, Feng
> Cc: Tian, Kevin; Zhang, Yang Z; xen-devel@xxxxxxxxxxxxx
> Subject: Re: VT-d Posted-interrupt (PI) design for XEN
>
> >>> On 04.03.15 at 14:30, <feng.wu@xxxxxxxxx> wrote:
> > - Introduce a new global vector which is used to wake up the HLT'ed vCPU.
> > Currently, there is a global vector 'posted_intr_vector', which is used as
> > the global notification vector for all vCPUs in the system. This vector is
> > stored in VMCS and CPU considers it as a special vector, uses it to notify
> > the related pCPU when an interrupt is recorded in the posted-interrupt
> > descriptor.
> >
> > After having VT-d PI, VT-d engine can issue notification event when the
> > assigned devices issue interrupts. We need add a new global vector to
> > wakeup the HLT'ed vCPU, please refer to the following scenario for the
> > usage of this new global vector:
> >
> > 1. vCPU0 is running on pCPU0
> > 2. vCPU0 is HLT'ed and vCPU1 is currently running on pCPU0
> > 3. An external interrupt from an assigned device occurs for vCPU0, if we
> > still use 'posted_intr_vector' as the notification vector for vCPU0, the
> > notification event for vCPU0 (the event will go to pCPU1) will be consumed
> > by vCPU1 incorrectly. The worst case is that vCPU0 will never be woken up
> > again since the wakeup event for it is always consumed by other vCPUs
> > incorrectly. So we need introduce another global vector, naming
> > 'pi_wakeup_vector' to wake up the HTL'ed vCPU.
>
> I'm afraid you describe a particular scenario here, but I don't see
> how this is related to the introduction of another global vector:
> Either the current (global) vector is sufficient, or another global
> vector also can't solve your problem. I'm sure I'm missing something
> here, so please be explicit.
>

In fact, the new global vector is used for the above scenario.
Let me explain this a bit more:

After having VT-d PI, when an external interrupt from an assigned device
happens, here is the hardware processing flow:

1. The interrupt happens.
2. Find the associated IRTE.
3. Find the destination vCPU from the IRTE (from the posted-interrupt
   descriptor address).
4. Sync the interrupt (stored in the IRTE as 'virtual vector') to the PIR
   field in the posted-interrupt descriptor.
5. If needed (please refer to the VT-d spec for the condition under which a
   notification event is issued), issue a notification event to the
   destination CPU, which is stored in the posted-interrupt descriptor as
   'NDST'.

Back to the above scenario:

1. vCPU0 is running on pCPU0, and the 'NDST' field of vCPU0's
   posted-interrupt descriptor is pCPU0.
2. vCPU0 is HLT'ed and vCPU1 is currently running on pCPU0.
3. An external interrupt from an assigned device happens. The destination
   of this interrupt is determined by the above flow (IRTE -->
   posted-interrupt descriptor address/vCPU --> notification event to
   'NDST'). If this external interrupt is for vCPU0, the notification event
   will be delivered to pCPU0, since the 'NDST' field of vCPU0's
   posted-interrupt descriptor is pCPU0.

If we use the current (global) vector for the notification event for vCPU0
in the above case, then, since the current global vector (notification
vector) is a special vector to the CPU, vCPU1 will consume it because vCPU1
is currently running on pCPU0, so we fail to wake up the HLT'ed vCPU0.
Please refer to Section 29.6 in the Intel SDM about how the CPU handles this
special vector:
http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf

After introducing a new global vector named 'pi_wakeup_vector', before a
vCPU is HLT'ed we set the 'NV' field (Notification Vector) in the vCPU's
posted-interrupt descriptor to 'pi_wakeup_vector'. This is a normal vector
to the CPU, and the CPU will not do anything special for it (unlike the
current global vector).
In the handler of this vector, we can wake up the HLT'ed vCPU.

> > - Update posted-interrupt descriptor during vCPU scheduling
> > The basic idea here is:
> > 1. When vCPU's state is RUNSTATE_running,
> >    - Set 'NV' to 'posted_intr_vector'.
> >    - Clear 'SN' to accept posted-interrupts.
> >    - Set 'NDST' to the pCPU on which the vCPU will be running.
> >[...]
>
> This is pretty hard to read without knowing what the abbreviations
> actually stand for, and suggesting to hunt for them in the spec isn't
> very reader friendly either. Please explain these fields, at the very
> least by way of comments on the structure fields presented earlier.
>

There are some changes to the IRTE and the posted-interrupt descriptor after
VT-d PI is introduced:

IRTE:
    Posted-interrupt Descriptor Address: the address of the
        posted-interrupt descriptor
    Virtual Vector: the guest vector of the interrupt
    URG: indicates if the interrupt is urgent

Posted-interrupt descriptor:
The Posted Interrupt Descriptor hosts the following fields:
    Posted Interrupt Request (PIR): Provide storage for posting (recording)
        interrupts (one bit per vector, for up to 256 vectors).
    Outstanding Notification (ON): Indicate if there is a notification
        event outstanding (not processed by processor or software) for this
        Posted Interrupt Descriptor. When this field is 0, hardware modifies
        it from 0 to 1 when generating a notification event, and the entity
        receiving the notification event (processor or software) resets it
        as part of posted interrupt processing.
    Suppress Notification (SN): Indicate if a notification event is to be
        suppressed (not generated) for non-urgent interrupt requests
        (interrupts processed through an IRTE with URG=0).
    Notification Vector (NV): Specify the vector for notification event
        (interrupt).
    Notification Destination (NDST): Specify the physical APIC-ID of the
        destination logical processor for the notification event.
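To make the field descriptions and the two descriptor updates easier to follow, here is a minimal C sketch. It is only an illustration, not the proposed Xen code: the names 'struct pi_desc', 'struct irte', 'pi_on_run', 'pi_on_block' and 'vtd_post' are hypothetical, the vector numbers are made up, and the struct does not reproduce the real hardware bit layout. The notification condition in 'vtd_post' (ON clear, and either URG set or SN clear) is my simplified reading of the VT-d spec, and real hardware performs the update atomically.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vector numbers, chosen only for this sketch. */
#define POSTED_INTR_VECTOR 0xf2  /* special vector, handled by the CPU itself */
#define PI_WAKEUP_VECTOR   0xf1  /* normal vector, handled by a Xen handler  */

/* Logical view of the posted-interrupt descriptor (not the real layout). */
struct pi_desc {
    uint32_t pir[8];  /* PIR: one bit per vector, 256 vectors in total  */
    bool     on;      /* ON: a notification event is outstanding        */
    bool     sn;      /* SN: suppress notification for non-urgent irqs  */
    uint8_t  nv;      /* NV: vector used for the notification event     */
    uint32_t ndst;    /* NDST: physical APIC ID of the destination pCPU */
};

/* Logical view of the IRTE fields added for posting. */
struct irte {
    struct pi_desc *pda;         /* Posted-interrupt Descriptor Address */
    uint8_t         virt_vector; /* Virtual Vector (guest vector)       */
    bool            urg;         /* URG: interrupt is urgent            */
};

/* vCPU is about to run on a pCPU (RUNSTATE_running). */
void pi_on_run(struct pi_desc *d, uint32_t pcpu_apic_id)
{
    d->nv = POSTED_INTR_VECTOR;
    d->sn = false;               /* accept posted interrupts */
    d->ndst = pcpu_apic_id;
}

/* vCPU is about to be HLT'ed: only the NV change described above is modeled. */
void pi_on_block(struct pi_desc *d)
{
    d->nv = PI_WAKEUP_VECTOR;
}

/*
 * Model of the hardware flow for one interrupt from an assigned device.
 * Returns the notification vector and stores the destination in *dest,
 * or returns -1 if no notification event is generated.
 */
int vtd_post(const struct irte *e, uint32_t *dest)
{
    struct pi_desc *d = e->pda;

    assert(d != NULL);

    /* Step 4: record the virtual vector in PIR. */
    d->pir[e->virt_vector / 32] |= UINT32_C(1) << (e->virt_vector % 32);

    /* Step 5: notify only if none is outstanding and it is not suppressed. */
    if (d->on || (d->sn && !e->urg))
        return -1;

    d->on = true;
    *dest = d->ndst;
    return d->nv;
}
```

With this model, an interrupt posted after pi_on_block() is signalled with PI_WAKEUP_VECTOR on the pCPU recorded in NDST, so it reaches an ordinary handler that can wake the vCPU instead of being consumed by whichever vCPU is running there.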
> > On Xen side, what is your opinion about support lowest-priority interrupts
> > for VT-d PI?
>
> I certainly think (as with every other virtualized piece of hardware)
> that hardware behavior should be emulated as closely as possible.
> I.e. yes, we should have it eventually. As to the two stage approach
> mentioned for KVM - I've grown reservations against Intel people
> making promises towards future implementation of something, i.e.
> I'm kind of hesitant to agree to such an implementation model. Yet
> you're to contribute the patches, and I'm surely not planning to veto
> a stage-1-only implementation as it would be an improvement anyway.
>

Well, I am okay with doing a full implementation for lowest-priority. KVM
people tend to do simple things at the first stage of hardware enabling.
If you don't like doing it this way, I will skip stage 1 above and
implement the full solution directly on the Xen side.

Thanks,
Feng

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel