Re: [Xen-devel] [PATCH v4 01/28] Xen/doc: Add Xen virtual IOMMU doc
On Fri, Feb 09, 2018 at 12:54:11PM +0000, Roger Pau Monné wrote:
>On Fri, Nov 17, 2017 at 02:22:08PM +0800, Chao Gao wrote:
>> From: Lan Tianyu <tianyu.lan@xxxxxxxxx>
>>
>> This patch adds a Xen virtual IOMMU doc to introduce the motivation,
>> framework, vIOMMU hypercall and xl configuration.
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@xxxxxxxxx>
>> Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx>
>> ---
>>  docs/misc/viommu.txt | 120 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 120 insertions(+)
>>  create mode 100644 docs/misc/viommu.txt
>>
>> diff --git a/docs/misc/viommu.txt b/docs/misc/viommu.txt
>> new file mode 100644
>> index 0000000..472d2b5
>> --- /dev/null
>> +++ b/docs/misc/viommu.txt
>> @@ -0,0 +1,120 @@
>> +Xen virtual IOMMU
>> +
>> +Motivation
>> +==========
>> +Enable support for more than 128 vcpus
>> +
>> +HPC cloud services require VMs with a high number of vCPUs in order
>> +to achieve high performance in parallel computing.
>> +
>> +To support >128 vcpus, x2APIC mode in the guest is necessary because
>> +the legacy APIC (xAPIC) only supports 8-bit APIC IDs. The APIC ID used
>> +by Xen is CPU ID * 2 (ie: CPU 127 has APIC ID 254, which is the last
>> +one available in xAPIC mode), so at most 128 vcpus can be supported.
>> +x2APIC mode supports 32-bit APIC IDs and requires the interrupt
>> +remapping functionality of a vIOMMU if the guest wishes to route
>> +interrupts to all available vCPUs.
>> +
>> +PCI MSI/IOAPIC can only send interrupt messages containing an 8-bit
>> +APIC ID, which cannot address cpus with an APIC ID >254. Interrupt
>> +remapping supports 32-bit APIC IDs, so it is necessary for >128 vcpu
>> +support.
>> +
>> +vIOMMU Architecture
>> +===================
>> +The vIOMMU device model is inside the Xen hypervisor for the following
>> +reasons:
>> + 1) Avoid round trips between Qemu and the Xen hypervisor
>> + 2) Ease of integration with the rest of the hypervisor
>> + 3) PVH doesn't use Qemu
>> +
>> +* Interrupt remapping overview.
>> +Interrupts from virtual devices and physical devices are delivered
>> +to the vLAPIC from the vIOAPIC and vMSI. The vIOMMU needs to remap
>> +interrupts during this procedure.
>> +
>> ++---------------------------------------------------+
>> +|Qemu                      |VM                      |
>> +|                          |  +----------------+    |
>> +|                          |  | Device driver  |    |
>> +|                          |  +--------+-------+    |
>> +|                          |           ^            |
>> +|  +----------------+      |  +--------+-------+    |
>> +|  | Virtual device |      |  | IRQ subsystem  |    |
>> +|  +-------+--------+      |  +--------+-------+    |
>> +|          |               |           ^            |
>> +|          |               |           |            |
>> ++--------------------------+------------------------+
>> +|hypervisor|                           | VIRQ       |
>> +|          |                 +---------+--------+   |
>> +|          |                 |      vLAPIC      |   |
>> +|          |VIRQ             +---------+--------+   |
>> +|          |                           ^            |
>> +|          |                           |            |
>> +|          |                 +---------+--------+   |
>> +|          |                 |      vIOMMU      |   |
>> +|          |                 +---------+--------+   |
>> +|          |                           ^            |
>> +|          |                           |            |
>> +|          |                 +---------+--------+   |
>> +|          |                 |   vIOAPIC/vMSI   |   |
>> +|          |                 +----+----+--------+   |
>> +|          |                      ^    ^            |
>> +|          +----------------------+    |            |
>> +|                                      |            |
>> ++---------------------------------------------------+
>> +HW                                     |IRQ
>> +                             +-------------------+
>> +                             |    PCI Device     |
>> +                             +-------------------+
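
To make the remapping step above concrete: once interrupt remapping is
active, the interrupt message from a (v)IOAPIC/MSI source carries only an
index (a "handle") into a remapping table, and the table entry supplies the
full destination, so the 8-bit destination field of the message stops being
a limit. The following is a deliberately simplified, self-contained sketch
of that lookup; the entry layout is hypothetical and much smaller than a
real VT-d IRTE, and it is not Xen's implementation:

    #include <stdint.h>
    #include <stdio.h>

    /*
     * Simplified remapping table entry.  A real VT-d IRTE is a 128-bit
     * structure with many more fields; only what the example needs is kept.
     */
    struct irte {
        uint8_t  present;
        uint8_t  vector;
        uint32_t dest_id;    /* full 32-bit destination APIC ID */
    };

    #define IRT_SIZE 256
    static struct irte irt[IRT_SIZE];

    /*
     * Remap an incoming interrupt: the source names only a table index,
     * and the destination APIC ID comes from the table entry, not from
     * the (8-bit limited) message itself.
     */
    static int remap_interrupt(uint16_t handle, uint32_t *dest, uint8_t *vec)
    {
        if (handle >= IRT_SIZE || !irt[handle].present)
            return -1;    /* would raise a fault in a real IOMMU */
        *dest = irt[handle].dest_id;
        *vec  = irt[handle].vector;
        return 0;
    }

    int main(void)
    {
        uint32_t dest;
        uint8_t vec;

        /*
         * Route handle 5 to APIC ID 400 (vCPU 200 under Xen's "CPU ID * 2"
         * scheme), a destination no 8-bit MSI/IOAPIC field could encode.
         */
        irt[5] = (struct irte){ .present = 1, .vector = 0x31, .dest_id = 400 };

        if (remap_interrupt(5, &dest, &vec) == 0)
            printf("deliver vector 0x%x to APIC ID %u\n",
                   (unsigned)vec, (unsigned)dest);
        return 0;
    }

This is why interrupt remapping is the key enabler for >128 vcpus: the
destination lives in the table, not in the interrupt message.
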
>> +
>> +
>> +vIOMMU hypercall
>> +================
>> +Introduce a new domctl hypercall, "xen_domctl_viommu_op", to create
>> +vIOMMU instances in the hypervisor. The vIOMMU instance is destroyed
>> +when its domain is destroyed.
>> +
>> +* vIOMMU hypercall parameter structure
>> +
>> +/* vIOMMU type - specify vendor vIOMMU device model */
>> +#define VIOMMU_TYPE_INTEL_VTD       0
>> +
>> +/* vIOMMU capabilities */
>> +#define VIOMMU_CAP_IRQ_REMAPPING    (1u << 0)
>> +
>> +struct xen_domctl_viommu_op {
>> +    uint32_t cmd;
>> +#define XEN_DOMCTL_viommu_create    0
>> +    union {
>> +        struct {
>> +            /* IN - vIOMMU type */
>> +            uint8_t type;
>> +            /* IN - MMIO base address of vIOMMU. */
>> +            uint64_t base_address;
>> +            /* IN - Capabilities with which we want to create */
>> +            uint64_t capabilities;
>> +            /* OUT - vIOMMU identity */
>> +            uint32_t id;
>> +        } create;
>> +    } u;
>> +};
>> +
>> +- XEN_DOMCTL_viommu_create
>> +  Create a vIOMMU device with the given type, capabilities and MMIO
>> +base address. The hypervisor allocates a viommu_id for the new vIOMMU
>> +instance and returns it. The vIOMMU device model in the hypervisor
>> +checks whether it can support the requested capabilities and returns
>> +an error if not.
>> +
>> +The vIOMMU domctl and the vIOMMU option in the configuration file
>> +allow for multi-vIOMMU support for a single VM (e.g. the parameters
>> +for creating a vIOMMU include a vIOMMU id), but the implementation
>> +only supports one vIOMMU per VM so far.
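
As an illustration of the intended calling convention, a toolstack caller
might drive this domctl roughly as sketched below. This is only a sketch:
issue_domctl() is a hypothetical stand-in for the toolstack's actual
hypercall plumbing (not an existing libxc function, so it is stubbed here
to keep the sketch compilable); only the structure layout and constants
are taken from the interface quoted above.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Constants and structure mirrored from the proposed interface above. */
    #define VIOMMU_TYPE_INTEL_VTD      0
    #define VIOMMU_CAP_IRQ_REMAPPING   (1u << 0)
    #define XEN_DOMCTL_viommu_create   0

    struct xen_domctl_viommu_op {
        uint32_t cmd;
        union {
            struct {
                uint8_t  type;          /* IN - vIOMMU type */
                uint64_t base_address;  /* IN - MMIO base address */
                uint64_t capabilities;  /* IN - requested capabilities */
                uint32_t id;            /* OUT - vIOMMU identity */
            } create;
        } u;
    };

    /*
     * Hypothetical stand-in for the toolstack's domctl plumbing; a real
     * toolstack would issue the actual hypercall here.
     */
    static int issue_domctl(uint32_t domid, struct xen_domctl_viommu_op *op)
    {
        (void)domid;
        if (op->cmd != XEN_DOMCTL_viommu_create)
            return -1;
        op->u.create.id = 0;    /* pretend the hypervisor allocated id 0 */
        return 0;
    }

    /* Create a VT-d flavoured vIOMMU with interrupt remapping enabled. */
    static int create_viommu(uint32_t domid, uint64_t base, uint32_t *id)
    {
        struct xen_domctl_viommu_op op;

        memset(&op, 0, sizeof(op));
        op.cmd                   = XEN_DOMCTL_viommu_create;
        op.u.create.type         = VIOMMU_TYPE_INTEL_VTD;
        op.u.create.base_address = base;
        op.u.create.capabilities = VIOMMU_CAP_IRQ_REMAPPING;

        int rc = issue_domctl(domid, &op);
        if (rc == 0)
            *id = op.u.create.id;   /* allocated by the hypervisor */
        return rc;
    }

    int main(void)
    {
        uint32_t id;

        /* Domain id and MMIO base are illustrative values only. */
        if (create_viommu(1, 0xfed90000ULL, &id) == 0)
            printf("vIOMMU created, id %u\n", (unsigned)id);
        return 0;
    }

Note the error path: per the description above, the device model rejects
capability bits it cannot support, so a caller should expect the create to
fail rather than silently degrade.
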
>> +
>> +xl x86 vIOMMU configuration
>> +===========================
>> +viommu = [
>> +    'type=intel_vtd,intremap=1',
>> +    ...
>> +]
>> +
>> +"type" - Specifies the vIOMMU device model type. Currently only the
>> +Intel VT-d device model is supported.
>
>Although I see the point in being able to specify the vIOMMU type, is
>this really helpful from an admin PoV?
>
>What would happen for example if you try to add an Intel vIOMMU to a
>guest running on an AMD CPU? I guess the guest OSes would be quite
>surprised about that...
>
>I think the most common way to use this option would be:
>
>viommu = [
>    'intremap=1',
>    ...
>]

Agreed.

>
>And vIOMMUs should automatically be added to guests with > 128 vCPUs?
>IIRC Linux requires a vIOMMU in order to run with > 128 vCPUs (which
>is quite arbitrary, but anyway...).

I think Linux will only use 128 CPUs in this case, as it does on bare
metal. Considering that a benign VM shouldn't have such a weird
configuration (>128 vcpus but no vIOMMU), adding vIOMMUs automatically
when needed is fine with me.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel