[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Xen virtual IOMMU high level design doc V2




On 10/19/2016 4:26 AM, Konrad Rzeszutek Wilk wrote:
On Tue, Oct 18, 2016 at 10:14:16PM +0800, Lan Tianyu wrote:

1 Motivation for Xen vIOMMU
===============================================================================
1.1 Enable more than 255 vcpu support
HPC cloud service requires VM provides high performance parallel
computing and we hope to create a huge VM with >255 vcpu on one machine
to meet such requirement.Ping each vcpus on separated pcpus. More than
255 vcpus support requires X2APIC and Linux disables X2APIC mode if
there is no interrupt remapping function which is present by vIOMMU.
Interrupt remapping function helps to deliver interrupt to #vcpu >255.
So we need to add vIOMMU before enabling >255 vcpus.

What about Windows? Does it care about this?

From our test, win8 guest crashes when boot up 288 vcpus without IR and it can boot up with IR

3.2 l2 translation
1) For virtual PCI device
Xen dummy xen-vIOMMU in Qemu translates IOVA to target GPA via new
hypercall when DMA operation happens.

2) For physical PCI device
DMA operations go though physical IOMMU directly and IO page table for
IOVA->HPA should be loaded into physical IOMMU. When guest updates
l2 Page-table pointer field, it provides IO page table for
IOVA->GPA. vIOMMU needs to shadow l2 translation table, translate
GPA->HPA and update shadow page table(IOVA->HPA) pointer to l2
Page-table pointer to context entry of physical IOMMU.

Now all PCI devices in same hvm domain share one IO page table
(GPA->HPA) in physical IOMMU driver of Xen. To support l2
translation of vIOMMU, IOMMU driver need to support multiple address
spaces per device entry. Using existing IO page table(GPA->HPA)
defaultly and switch to shadow IO page table(IOVA->HPA) when l2

defaultly?

I mean GPA->HPA mapping will set in the assigned device's context entry of pIOMMU when VM creates. Just like current code works.



3.3 Interrupt remapping
Interrupts from virtual devices and physical devices will be delivered
to vlapic from vIOAPIC and vMSI. It needs to add interrupt remapping
hooks in the vmsi_deliver() and ioapic_deliver() to find target vlapic
according interrupt remapping table.


3.4 l1 translation
When nested translation is enabled, any address generated by l1
translation is used as the input address for nesting with l2
translation. Physical IOMMU needs to enable both l1 and l2 translation
in nested translation mode(GVA->GPA->HPA) for passthrough
device.

VT-d context entry points to guest l1 translation table which
will be nest-translated by l2 translation table and so it
can be directly linked to context entry of physical IOMMU.

I think this means that the shared_ept will be disabled?

The shared_ept(GPA->HPA mapping) is used to do nested translation
for any output from l1 translation(GVA->GPA).




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.