
RE: [Xen-devel] MSI and VT-d interrupt remapping



Espen Skoglund <espen.skoglund@xxxxxxxxxxxxx> wrote:
> [Yunhong Jiang]
>> xen-devel-bounces@xxxxxxxxxxxxxxxxxxx wrote:
>>> You're right in that Linux does not currently support this.  You
>>> can, however, allocate multiple interrupts using MSI-X.  Anyhow, I
>>> was not envisioning this feature being used directly for
>>> passthrough device access.  Rather, I was considering the case
>>> where a device could be configured to communicate data directly
>>> into a VM (e.g., using multi-queue NICs) and deliver the interrupt
>>> to the appropriate VM.  In this case the frontend in the guest
>>> would not need to see a multi-message MSI device, only the backend
>>> in dom0/the driver domain would need to be made aware of it.
> 
>> Although I don't know of any device with such a usage model (Intel's
>> VMDq uses MSI-X), yes, your usage model would be helpful.  To achieve
>> this we may need to change the protocol between the pci backend and
>> the pci frontend; in fact, pci_enable_msi/pci_enable_msix could
>> perhaps be combined, with a flag to determine whether the vectors
>> should be contiguous or not.
> 
> This is similar to my initial idea as well.  In addition to being
> contiguous, the multi-message MSI request would also need to allocate
> vectors that are properly aligned.

Yes, but I don't think we need to add the implementation now.  We can
change xen_pci_op to accommodate this requirement; otherwise this will
diverge further from upstream Linux.  (The hypercall may need to be
changed for this requirement as well.)
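To make this concrete, something along the lines of the sketch below is
what I have in mind.  The op number, field names and flag are only
illustrative and do not exist in the current pciif.h:

#include <stdint.h>

/*
 * Sketch only: an MSI-enable request carrying a flag so that pcifront
 * can ask pciback/dom0 for a contiguous (and aligned) block of vectors.
 */
#define XEN_PCI_OP_enable_msi_block   7           /* hypothetical op */
#define XEN_PCI_MSI_FLAG_CONTIGUOUS   (1u << 0)   /* contiguous, aligned block */

struct xen_pci_op_msi_block {
    uint32_t cmd;       /* XEN_PCI_OP_enable_msi_block */
    int32_t  err;       /* filled in by the backend */
    uint32_t domain;    /* PCI segment */
    uint32_t bus;
    uint32_t devfn;
    uint32_t nvec;      /* number of vectors requested */
    uint32_t flags;     /* XEN_PCI_MSI_FLAG_* */
    int32_t  value;     /* base vector/pirq returned on success */
};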

As for set_irq_affinity, I think it is a general issue, not MSI-specific,
and we can continue to follow up on it.


> 
>> One thing left is how the driver domain can bind the vector to the
>> frontend VM.  Some sanity-check mechanism should be added.
> 
> Well, there exists a domctl for modifying the permissions of a pirq.
> This could be used to grant pirq access to a frontend domain.  Not sure
> if this is sufficient.
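For reference, the domctl is XEN_DOMCTL_irq_permission.  From a tool in
dom0 or the driver domain it could be used roughly as below; the libxc
call follows the interface as I remember it, so check xenctrl.h in your
tree for the exact prototype:

#include <stdio.h>
#include <xenctrl.h>

/* Grant a frontend domain access to a pirq via the irq-permission domctl. */
static int grant_pirq(int xc_handle, uint32_t frontend_domid, unsigned int pirq)
{
    int rc = xc_domain_irq_permission(xc_handle, frontend_domid, pirq,
                                      1 /* allow access */);
    if ( rc )
        fprintf(stderr, "granting pirq %u to dom%u failed: %d\n",
                pirq, frontend_domid, rc);
    return rc;
}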
> 
> Also, as discussed in my previous reply dom0 may need the ability to
> reset the affinity of an irq when migrating the destination vcpu.
> Further, a pirq is now always bound to vcpu[0] of a domain (in
> evtchn_bind_pirq).  There is clearly some room for improvement and
> more flexibility here.
> 
> Not sure what the best solution is.  One option is to allow a guest to
> re-bind a pirq to set its affinity, and have such explicitly set
> affinities be automatically updated when the associated vcpu is
> migrated.  Another option is to create unbound ports in a guest domain
> and let a privileged domain bind pirqs to those ports.  The privileged
> domain should then also be allowed to later modify the destination
> vcpu and set the affinity of the bound pirq.
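To illustrate the second option: the guest (or someone on its behalf)
allocates an unbound port with the existing EVTCHNOP_alloc_unbound, and a
new privileged operation, which is purely hypothetical below, would let
the privileged domain attach the pirq to that port and pick or later
change the destination vcpu:

#include <stdint.h>

/* Local stand-ins for the Xen types, to keep the sketch self-contained. */
typedef uint16_t domid_t;
typedef uint32_t evtchn_port_t;

/* Existing interface: allocate an unbound port in the guest domain. */
struct evtchn_alloc_unbound {
    domid_t dom;            /* domain in which to allocate the port */
    domid_t remote_dom;     /* domain allowed to bind to it */
    evtchn_port_t port;     /* OUT: allocated port */
};

/* Hypothetical privileged operation: bind a pirq to the guest's port and
 * select the destination vcpu; reissuing it could move the interrupt. */
struct evtchn_bind_pirq_for {
    domid_t dom;            /* guest owning the port */
    evtchn_port_t port;     /* port allocated above */
    uint32_t pirq;          /* physical irq to bind */
    uint32_t vcpu;          /* destination vcpu within the guest */
};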
> 
> 
>> BTW, can you tell which devices may use this feature?  I'm a bit
>> interested in this.
> 
> I must confess that I do not know of any device that currently uses
> this feature (perhaps Solarflare or NetXen devices have support for
> it), and the whole connection with VT-d interrupt remapping is as of
> now purely academic anyway due to the lack of chipsets with the
> appropriate feature.
> 
> However, the whole issue of binding multiple pirqs of a device to
> different guest domains remains the same even if using MSI-X.
> Multi-message MSI devices only/mostly add some additional restrictions
> upon allocating interrupt vectors.
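Concretely, the restriction amounts to something like the sketch below: a
block of 2^n contiguous vectors whose base is aligned to the block size,
since the device ORs the message number into the low bits of the vector.
The table and vector range here are placeholders for illustration, not
the real allocator:

#define FIRST_DYNAMIC_VECTOR  0x20
#define LAST_DYNAMIC_VECTOR   0xef

static int vector_in_use[256];

/* Find and claim a naturally aligned block of "count" free vectors. */
static int alloc_aligned_vector_block(unsigned int count)
{
    unsigned int base, i;

    if ( count == 0 || (count & (count - 1)) )   /* power of two only */
        return -1;

    /* Only naturally aligned bases are candidates. */
    for ( base = (FIRST_DYNAMIC_VECTOR + count - 1) & ~(count - 1);
          base + count - 1 <= LAST_DYNAMIC_VECTOR;
          base += count )
    {
        for ( i = 0; i < count; i++ )
            if ( vector_in_use[base + i] )
                break;
        if ( i == count )
        {
            for ( i = 0; i < count; i++ )
                vector_in_use[base + i] = 1;
            return (int)base;                    /* base vector of the block */
        }
    }
    return -1;                                   /* no suitable block free */
}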
> 
> 
>>>>> I do not think explicitly specifying destination APIC upon
>>>>> allocation is the best idea.  Setting the affinity upon binding
>>>>> the interrupt like it's done today seems like a better approach.
>>>>> This leaves us with dealing with the vectors.
>>> 
>>>> But what should happen when the vcpu is migrated to another
>>>> physical cpu?  I'm not sure about the cost of programming the
>>>> interrupt remapping table; otherwise, that would be a good way to
>>>> achieve the affinity.
>>> 
>>> As you've already said, the interrupt affinity is only set when a
>>> pirq is bound.  The interrupt routing is not redirected if the vcpu
>>> it's bound to migrates to another physical cpu.  This can (should?)
>>> be changed in the future so that the affinity is either set
>>> implicitly when migrating the vcpu, or explicitly with a rebind
>>> call by dom0.  In any case the affinity would be reset by the
>>> set_affinity method.
> 
>> Yes, I remember Keir suggested using the interrupt remapping table
>> in VT-d to achieve this; I'm not sure whether that is still OK.
> 
> Relying on the VT-d interrupt remapping table would rule out any Intel
> chipset on the market today, and also the equivalent solution (if any)
> used by AMD and others.
> 
> It seems better to update the IOAPIC entry or MSI capability structure
> directly when redirecting the interrupt, and let io_apic_write() or
> the equivalent function for MSI rewrite the interrupt remapping table
> if VT-d is enabled.  Not sure how much it would cost to rewrite the
> remapping table and perform the respective VT-d interrupt entry cache
> flush; it's difficult to measure without actually having any available
> hardware.  However, I suspect the cost would in many cases be dwarfed
> by migrating the cache working set and by other associated costs of
> migrating a vcpu. 
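The write path you describe would then be shaped roughly like the sketch
below; all of the function names are placeholders for where the real
io_apic/MSI code and the VT-d driver would hook in, not existing
interfaces:

#include <stdint.h>

/* Placeholder prototypes; the real code would live in io_apic.c/msi.c and
 * the VT-d driver. */
int  intremap_enabled(void);
void update_irte_from_rte(unsigned int apic, unsigned int pin, uint64_t rte);
void flush_interrupt_entry_cache(unsigned int apic, unsigned int pin);
void raw_io_apic_write_rte(unsigned int apic, unsigned int pin, uint64_t rte);

/*
 * Sketch: generic code keeps rewriting the IOAPIC RTE (or the MSI
 * address/data) when redirecting an interrupt; if VT-d interrupt remapping
 * is enabled, the low-level write instead rewrites the interrupt remapping
 * entry and flushes the interrupt entry cache for that entry.
 */
void io_apic_write_entry_sketch(unsigned int apic, unsigned int pin,
                                uint64_t rte)
{
    if ( intremap_enabled() )
    {
        update_irte_from_rte(apic, pin, rte);     /* rewrite the IRTE */
        flush_interrupt_entry_cache(apic, pin);   /* per-entry IEC flush */
    }
    else
    {
        raw_io_apic_write_rte(apic, pin, rte);    /* plain RTE write */
    }
}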
> 
>       eSk

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

