Re: [Xen-devel] [PATCH V3 1/29] Xen/doc: Add Xen virtual IOMMU doc
On Thu, Sep 21, 2017 at 11:01:42PM -0400, Lan Tianyu wrote:
> This patch is to add Xen virtual IOMMU doc to introduce motivation,
> framework, vIOMMU hypercall and xl configuration.
>
> Signed-off-by: Lan Tianyu <tianyu.lan@xxxxxxxxx>
> ---
>  docs/misc/viommu.txt | 136 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 136 insertions(+)
>  create mode 100644 docs/misc/viommu.txt
>
> diff --git a/docs/misc/viommu.txt b/docs/misc/viommu.txt
> new file mode 100644
> index 0000000..348e8c4
> --- /dev/null
> +++ b/docs/misc/viommu.txt
> @@ -0,0 +1,136 @@
> +Xen virtual IOMMU
> +
> +Motivation
> +==========
> +Enable more than 128 vcpu support
> +
> +The current requirements of HPC cloud service requires VM with a high
> +number of CPUs in order to achieve high performance in parallel
> +computing.
> +
> +To support >128 vcpus, X2APIC mode in guest is necessary because legacy
> +APIC(XAPIC) just supports 8-bit APIC ID. The APIC ID used by Xen is
> +CPU ID * 2 (ie: CPU 127 has APIC ID 254, which is the last one available
> +in xAPIC mode) and so it only can support 128 vcpus at most. x2APIC mode
> +supports 32-bit APIC ID and it requires the interrupt remapping functionality
> +of a vIOMMU if the guest wishes to route interrupts to all available vCPUs
> +
> +The reason for this is that there is no modification for existing PCI MSI
> +and IOAPIC when introduce X2APIC.

I'm not sure the above sentence makes much sense. IMHO I would just
remove it.

> PCI MSI/IOAPIC can only send interrupt
> +message containing 8-bit APIC ID, which cannot address cpus with >254
> +APIC ID. Interrupt remapping supports 32-bit APIC ID and so it's necessary
> +for >128 vcpus support.
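[Editor's note: the APIC ID arithmetic discussed above can be made concrete with a small sketch. The helper names are illustrative, not taken from the Xen tree; the facts encoded are that Xen assigns APIC ID = vCPU ID * 2 and that xAPIC IDs are 8 bits wide, with 0xFF reserved for broadcast, so 0xFE (vCPU 127) is the last usable ID.]

```c
#include <stdint.h>

/* Illustrative helpers (names are mine, not Xen's). Xen hands out
 * APIC ID = vCPU ID * 2, so vCPU 127 gets APIC ID 254 (0xFE). */
static uint32_t vcpu_to_apic_id(uint32_t vcpu_id)
{
    return vcpu_id * 2;
}

/* xAPIC APIC IDs are 8 bits; 0xFF is the broadcast destination, so
 * 0xFE is the highest ID a single CPU can own. */
static int fits_in_xapic(uint32_t apic_id)
{
    return apic_id <= 0xFE;
}
```

This is why vCPU 128 (APIC ID 256) is already unreachable without x2APIC plus interrupt remapping.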
> +
> +
> +vIOMMU Architecture
> +===================
> +vIOMMU device model is inside Xen hypervisor for following factors
> +    1) Avoid round trips between Qemu and Xen hypervisor
> +    2) Ease of integration with the rest of hypervisor
> +    3) HVMlite/PVH doesn't use Qemu

Just use PVH here, HVMlite == PVH now.

> +
> +* Interrupt remapping overview.
> +Interrupts from virtual devices and physical devices are delivered
> +to vLAPIC from vIOAPIC and vMSI. vIOMMU needs to remap interrupt during
> +this procedure.
> +
> ++---------------------------------------------------+
> +|Qemu                      |VM                      |
> +|                          |    +----------------+  |
> +|                          |    | Device driver  |  |
> +|                          |    +--------+-------+  |
> +|                          |             ^          |
> +|  +----------------+      |    +--------+-------+  |
> +|  | Virtual device |      |    | IRQ subsystem  |  |
> +|  +-------+--------+      |    +--------+-------+  |
> +|          |               |             ^          |
> +|          |               |             |          |
> ++--------------------------+------------------------+
> +|hypervisor|                            | VIRQ      |
> +|          |                   +--------+---------+ |
> +|          |                   |      vLAPIC      | |
> +|          |VIRQ               +--------+---------+ |
> +|          |                            ^           |
> +|          |                            |           |
> +|          |                   +--------+---------+ |
> +|          |                   |      vIOMMU      | |
> +|          |                   +--------+---------+ |
> +|          |                            ^           |
> +|          |                            |           |
> +|          |                   +--------+---------+ |
> +|          |                   |   vIOAPIC/vMSI   | |
> +|          |                   +----+----+--------+ |
> +|          |                        ^    ^          |
> +|          +------------------------+    |          |
> +|                                        |          |
> ++---------------------------------------------------+
> +HW                                       |IRQ
> +                               +-------------------+
> +                               |    PCI Device     |
> +                               +-------------------+
> +
> +
> +vIOMMU hypercall
> +================
> +Introduce a new domctl hypercall "xen_domctl_viommu_op" to create/destroy
> +vIOMMUs.
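[Editor's note: the diagram's remapping step exists because of the compatibility MSI address format, where the destination APIC ID occupies bits 19:12 of the address and is therefore limited to 8 bits. A minimal sketch of that field layout follows; the macro and function names are mine, not from the Xen tree.]

```c
#include <stdint.h>

/* Compatibility-format MSI address: the destination APIC ID sits in
 * bits 19:12, so only IDs 0-255 are expressible without a vIOMMU
 * remapping the interrupt. Names are illustrative. */
#define MSI_ADDR_DEST_ID_SHIFT  12
#define MSI_ADDR_DEST_ID_MASK   (0xffULL << MSI_ADDR_DEST_ID_SHIFT)

static uint32_t msi_dest_id(uint64_t msi_addr)
{
    return (uint32_t)((msi_addr & MSI_ADDR_DEST_ID_MASK) >>
                      MSI_ADDR_DEST_ID_SHIFT);
}
```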
> +
> +* vIOMMU hypercall parameter structure
> +
> +/* vIOMMU type - specify vendor vIOMMU device model */
> +#define VIOMMU_TYPE_INTEL_VTD       0
> +
> +/* vIOMMU capabilities */
> +#define VIOMMU_CAP_IRQ_REMAPPING    (1u << 0)
> +
> +struct xen_domctl_viommu_op {
> +    uint32_t cmd;
> +#define XEN_DOMCTL_create_viommu    0
> +#define XEN_DOMCTL_destroy_viommu   1

I would invert the order of the domctl names:

#define XEN_DOMCTL_viommu_create    0
#define XEN_DOMCTL_viommu_destroy   1

It's clearer if the operation is the last part of the name.

> +    union {
> +        struct {
> +            /* IN - vIOMMU type */
> +            uint64_t viommu_type;

Hm, do we really need a uint64_t for the IOMMU type? A uint8_t should
be more than enough (256 different IOMMU implementations).

> +            /* IN - MMIO base address of vIOMMU. */
> +            uint64_t base_address;
> +            /* IN - Capabilities with which we want to create */
> +            uint64_t capabilities;
> +            /* OUT - vIOMMU identity */
> +            uint32_t viommu_id;
> +        } create_viommu;
> +
> +        struct {
> +            /* IN - vIOMMU identity */
> +            uint32_t viommu_id;
> +        } destroy_viommu;

Do you really need the destroy operation? Do we expect to hot-unplug
vIOMMUs? Otherwise vIOMMUs should be removed when the domain is
destroyed.

> +    } u;
> +};
> +
> +- XEN_DOMCTL_create_viommu
> +    Create vIOMMU device with vIOMMU_type, capabilities and MMIO base
> +address. Hypervisor allocates viommu_id for new vIOMMU instance and return
> +back. The vIOMMU device model in hypervisor should check whether it can
> +support the input capabilities and return error if not.
> +
> +- XEN_DOMCTL_destroy_viommu
> +    Destroy vIOMMU in Xen hypervisor with viommu_id as parameter.
> +
> +These vIOMMU domctl and vIOMMU option in configure file consider multi-vIOMMU
> +support for single VM.(e.g, parameters of create/destroy vIOMMU includes
> +vIOMMU id). But function implementation only supports one vIOMMU per VM so
> +far.
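[Editor's note: a sketch of how a toolstack caller might populate the create operation described above. The structure and constants are copied from the patch; the helper function, and the example 0xfed90000 MMIO base, are assumptions for illustration only, and the actual hypercall invocation is omitted.]

```c
#include <stdint.h>
#include <string.h>

/* Constants and structure as defined in the patch. */
#define VIOMMU_TYPE_INTEL_VTD      0
#define VIOMMU_CAP_IRQ_REMAPPING   (1u << 0)
#define XEN_DOMCTL_create_viommu   0
#define XEN_DOMCTL_destroy_viommu  1

struct xen_domctl_viommu_op {
    uint32_t cmd;
    union {
        struct {
            uint64_t viommu_type;   /* IN  */
            uint64_t base_address;  /* IN  */
            uint64_t capabilities;  /* IN  */
            uint32_t viommu_id;     /* OUT */
        } create_viommu;
        struct {
            uint32_t viommu_id;     /* IN  */
        } destroy_viommu;
    } u;
};

/* Hypothetical helper: fill in a create op requesting an Intel vIOMMU
 * with interrupt remapping. The 0xfed90000 base is just an example
 * address, not mandated by the patch. */
static struct xen_domctl_viommu_op build_create_op(void)
{
    struct xen_domctl_viommu_op op;

    memset(&op, 0, sizeof(op));
    op.cmd = XEN_DOMCTL_create_viommu;
    op.u.create_viommu.viommu_type  = VIOMMU_TYPE_INTEL_VTD;
    op.u.create_viommu.base_address = 0xfed90000ULL;
    op.u.create_viommu.capabilities = VIOMMU_CAP_IRQ_REMAPPING;
    return op;
}
```

On success the hypervisor would write the allocated identifier back into `viommu_id`, as the OUT annotation in the patch indicates.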
> +
> +Xen hypervisor vIOMMU command
> +=============================
> +Introduce vIOMMU command "viommu=1" to enable vIOMMU function in hypervisor.
> +It's default disabled.

Hm, I'm not sure we really need this. At the end viommu will be
disabled by default for guests, unless explicitly enabled in the
config file.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel