
Re: [Xen-devel] Xen virtual IOMMU high level design doc V2



On 20/10/16 10:53, Tian, Kevin wrote:
>> From: Andrew Cooper [mailto:andrew.cooper3@xxxxxxxxxx]
>> Sent: Wednesday, October 19, 2016 3:18 AM
>>
>>> 1.2 Support VFIO-based user space driver (e.g. DPDK) in the guest
>>> It relies on the l2 translation capability (IOVA->GPA) on the
>>> vIOMMU. The pIOMMU l2 becomes a shadow structure of the vIOMMU,
>>> isolating DMA requests initiated by the user space driver.
>> How is userspace supposed to drive this interface?  I can't picture how
>> it would function.
> Inside a Linux VM, VFIO provides a DMA MAP/UNMAP interface to the user
> space driver, so gIOVA->GPA mappings can be set up on the vIOMMU. The
> vIOMMU exports a "caching mode" capability to indicate that all guest
> PTE changes require explicit vIOMMU cache invalidations. By trapping
> those invalidation requests, Xen can update the corresponding shadow
> PTEs (gIOVA->HPA). Once a DMA mapping is established, the user space
> driver programs gIOVA addresses as DMA destinations into the assigned
> device; upstream DMA requests from the device then carry gIOVAs, which
> the pIOMMU shadow page table translates to HPA.

Ok.  So in this mode, the userspace driver owns the device and can
choose an arbitrary gIOVA layout?  If it also programs the DMA
addresses, I guess this setup is fine.
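
For reference, this is roughly what that guest-side flow looks like
with the standard VFIO type1 API (the group number and the gIOVA below
are made up, and error handling is mostly omitted):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <linux/vfio.h>

    int main(void)
    {
        /* The container holds the IOMMU context -- inside a guest,
         * that is the vIOMMU being discussed here. */
        int container = open("/dev/vfio/vfio", O_RDWR);
        int group = open("/dev/vfio/26", O_RDWR); /* group no. made up */

        ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
        ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);

        /* Allocate a DMA buffer and pick an arbitrary gIOVA for it. */
        void *buf = mmap(NULL, 1024 * 1024, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        struct vfio_iommu_type1_dma_map map = {
            .argsz = sizeof(map),
            .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
            .vaddr = (__u64)(unsigned long)buf,
            .iova  = 0x100000,      /* arbitrary gIOVA, driver's choice */
            .size  = 1024 * 1024,
        };

        /* This MAP_DMA is what becomes a gIOVA->GPA entry on the
         * vIOMMU; with caching mode set, the invalidation it triggers
         * is what Xen traps to install the gIOVA->HPA shadow entry. */
        if (ioctl(container, VFIO_IOMMU_MAP_DMA, &map))
            perror("VFIO_IOMMU_MAP_DMA");
        return 0;
    }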

>
>>>
>>> 1.3 Support guest SVM (Shared Virtual Memory)
>>> It relies on the l1 translation table capability (GVA->GPA) on the
>>> vIOMMU. The pIOMMU needs to enable both l1 and l2 translation in
>>> nested mode (GVA->GPA->HPA) for the passthrough device. IGD
>>> passthrough is the main usage today (to support the OpenCL 2.0 SVM
>>> feature). In the future SVM might be used by other I/O devices too.
>> As an aside, how is IGD intending to support SVM?  Will it be with PCIe
>> ATS/PASID, or something rather more magic as IGD is on the same piece of
>> silicon?
> Although integrated, IGD conforms to the standard PCIe PASID convention.
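
As a side note on "standard convention": concretely that means the
PASID extended capability is discovered and enabled through the
generic PCI plumbing.  A hedged sketch against the Linux helpers (the
wrapper function is made up; the pci_* calls are the real API):

    #include <linux/pci.h>
    #include <linux/pci-ats.h>

    /* Sketch only: how a driver would turn on the standard PCIe PASID
     * capability for a device such as IGD.  The actual call sites in
     * the Intel IOMMU driver differ; this just shows the interface. */
    static int sketch_enable_pasid(struct pci_dev *pdev)
    {
        int features = pci_pasid_features(pdev);

        if (features < 0)
            return -ENODEV;  /* no PASID extended capability */

        /* features may include PCI_PASID_CAP_EXEC and
         * PCI_PASID_CAP_PRIV; enable whatever the device reports. */
        return pci_enable_pasid(pdev, features);
    }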

Ok.  Any idea when hardware with SVM will be available?

>
>>> 3.5 Implementation consideration
>>> The VT-d spec doesn't define a capability bit for l2 translation,
>>> so architecturally there is no way to tell the guest that l2
>>> translation is unavailable. The Linux Intel IOMMU driver assumes l2
>>> translation is always present once VT-d exists, and fails to load
>>> without l2 translation support even if interrupt remapping and l1
>>> translation are available. So the vIOMMU needs to enable l2
>>> translation before the other functions.
>> What then is the purpose of the nested translation support bit in the
>> extended capability register?
>>
> Nested translation is for SVM virtualization. Given a DMA transaction
> containing a PASID, the VT-d engine first finds the 1st-level
> translation table through the PASID and translates GVA to GPA; with
> the nested translation capability enabled, it then further translates
> GPA to HPA using the 2nd-level translation table. Bare-metal usage is
> not expected to turn on this nested bit.

Ok, but what happens if a guest sees a PASID-capable vIOMMU and itself
tries to turn on nesting?  E.g. nesting KVM inside Xen and trying to use
SVM from the L2 guest?

If there is no way to indicate to the L1 guest that nesting isn't
available (as it is already actually in use), and we can't shadow
entries on faults, what is supposed to happen?

~Andrew
