Re: [Xen-devel] [PATCH V3 1/29] Xen/doc: Add Xen virtual IOMMU doc
Hi Roger:

Thanks for the review.

On 2017-10-18 21:26, Roger Pau Monné wrote:
> On Thu, Sep 21, 2017 at 11:01:42PM -0400, Lan Tianyu wrote:
>> This patch adds a Xen virtual IOMMU doc to introduce the motivation,
>> framework, vIOMMU hypercall and xl configuration.
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@xxxxxxxxx>
>> ---
>>  docs/misc/viommu.txt | 136 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 136 insertions(+)
>>  create mode 100644 docs/misc/viommu.txt
>>
>> diff --git a/docs/misc/viommu.txt b/docs/misc/viommu.txt
>> new file mode 100644
>> index 0000000..348e8c4
>> --- /dev/null
>> +++ b/docs/misc/viommu.txt
>> @@ -0,0 +1,136 @@
>> +Xen virtual IOMMU
>> +
>> +Motivation
>> +==========
>> +Enable more than 128 vcpu support
>> +
>> +Current HPC cloud services require VMs with a high number of vcpus in
>> +order to achieve high performance in parallel computing.
>> +
>> +To support >128 vcpus, x2APIC mode in the guest is necessary because the
>> +legacy APIC (xAPIC) only supports 8-bit APIC IDs. The APIC ID used by Xen
>> +is CPU ID * 2 (ie: CPU 127 has APIC ID 254, which is the last one
>> +available in xAPIC mode) and so it can only support 128 vcpus at most.
>> +x2APIC mode supports 32-bit APIC IDs and it requires the interrupt
>> +remapping functionality of a vIOMMU if the guest wishes to route
>> +interrupts to all available vcpus.
>> +
>> +The reason for this is that the existing PCI MSI and IOAPIC are not
>> +modified when x2APIC is introduced.
>
> I'm not sure the above sentence makes much sense. IMHO I would just
> remove it.

OK. Will remove.

>
>> PCI MSI/IOAPIC can only send interrupt
>> +messages containing an 8-bit APIC ID, which cannot address cpus with an
>> +APIC ID >254. Interrupt remapping supports 32-bit APIC IDs and so it's
>> +necessary for >128 vcpu support.
>> +
>> +
>> +vIOMMU Architecture
>> +===================
>> +The vIOMMU device model is inside the Xen hypervisor for the following
>> +reasons:
>> + 1) Avoid round trips between Qemu and the Xen hypervisor
>> + 2) Ease of integration with the rest of the hypervisor
>> + 3) HVMlite/PVH doesn't use Qemu
>
> Just use PVH here, HVMlite == PVH now.

OK.

>
>> +
>> +* Interrupt remapping overview.
>> +Interrupts from virtual and physical devices are delivered to the vLAPIC
>> +from the vIOAPIC and vMSI. The vIOMMU needs to remap interrupts during
>> +this procedure.
>> +
>> ++---------------------------------------------------+
>> +|Qemu                      |VM                      |
>> +|                          |   +----------------+  |
>> +|                          |   | Device driver  |  |
>> +|                          |   +--------+-------+  |
>> +|                          |            ^          |
>> +|      +----------------+  |   +--------+-------+  |
>> +|      | Virtual device |  |   |  IRQ subsystem |  |
>> +|      +-------+--------+  |   +--------+-------+  |
>> +|              |           |            ^          |
>> +|              |           |            |          |
>> ++--------------+-----------+------------------------+
>> +|hypervisor    |                        | VIRQ      |
>> +|              |              +---------+--------+  |
>> +|              |              |      vLAPIC      |  |
>> +|              |VIRQ          +---------+--------+  |
>> +|              |                        ^           |
>> +|              |                        |           |
>> +|              |              +---------+--------+  |
>> +|              |              |      vIOMMU      |  |
>> +|              |              +---------+--------+  |
>> +|              |                        ^           |
>> +|              |                        |           |
>> +|              |              +---------+--------+  |
>> +|              |              |   vIOAPIC/vMSI   |  |
>> +|              |              +----+----+--------+  |
>> +|              |                   ^    ^           |
>> +|              +-------------------+    |           |
>> +|                                       |           |
>> ++---------------------------------------------------+
>> +HW                                      |IRQ
>> +                              +-------------------+
>> +                              |     PCI Device    |
>> +                              +-------------------+
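To make the limitation described above concrete, here is a small
illustrative sketch (mine, not part of the patch) of the
compatibility-format MSI address layout as defined in the Intel VT-d
specification; the helper name is made up:

/*
 * Illustrative only, not part of the patch. Compatibility-format MSI
 * address layout per the Intel VT-d spec: bits 31:20 are 0xFEE and
 * bits 19:12 carry the destination APIC ID. Only 8 bits are
 * available, so APIC IDs above 0xFF simply cannot be encoded.
 */
#include <stdint.h>

#define MSI_ADDR_BASE        0xfee00000u
#define MSI_ADDR_DEST_SHIFT  12
#define MSI_ADDR_DEST_MASK   0xffu   /* the whole problem: 8 bits */

static inline uint32_t msi_compat_addr(uint32_t apic_id)
{
    /* APIC IDs > 0xff are truncated - there are no bits to hold them. */
    return MSI_ADDR_BASE |
           ((apic_id & MSI_ADDR_DEST_MASK) << MSI_ADDR_DEST_SHIFT);
}

With interrupt remapping the MSI address instead carries an index into
the interrupt remapping table, and the table entry holds a full 32-bit
destination ID, which is what makes x2APIC destinations reachable.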
>> +
>> +
>> +vIOMMU hypercall
>> +================
>> +Introduce a new domctl hypercall "xen_domctl_viommu_op" to create/destroy
>> +vIOMMUs.
>> +
>> +* vIOMMU hypercall parameter structure
>> +
>> +/* vIOMMU type - specify vendor vIOMMU device model */
>> +#define VIOMMU_TYPE_INTEL_VTD       0
>> +
>> +/* vIOMMU capabilities */
>> +#define VIOMMU_CAP_IRQ_REMAPPING    (1u << 0)
>> +
>> +struct xen_domctl_viommu_op {
>> +    uint32_t cmd;
>> +#define XEN_DOMCTL_create_viommu    0
>> +#define XEN_DOMCTL_destroy_viommu   1
>
> I would invert the order of the domctl names:
>
> #define XEN_DOMCTL_viommu_create 0
> #define XEN_DOMCTL_viommu_destroy 1
>
> It's clearer if the operation is the last part of the name.

OK. Will update.

>
>> +    union {
>> +        struct {
>> +            /* IN - vIOMMU type */
>> +            uint64_t viommu_type;
>
> Hm, do we really need a uint64_t for the IOMMU type? A uint8_t should
> be more than enough (256 different IOMMU implementations).

OK. Will update.

>
>> +            /* IN - MMIO base address of vIOMMU. */
>> +            uint64_t base_address;
>> +            /* IN - Capabilities with which we want to create */
>> +            uint64_t capabilities;
>> +            /* OUT - vIOMMU identity */
>> +            uint32_t viommu_id;
>> +        } create_viommu;
>> +
>> +        struct {
>> +            /* IN - vIOMMU identity */
>> +            uint32_t viommu_id;
>> +        } destroy_viommu;
>
> Do you really need the destroy operation? Do we expect to hot-unplug
> vIOMMUs? Otherwise vIOMMUs should be removed when the domain is
> destroyed.

Yes, there is no such requirement so far; it was added just for
multi-vIOMMU consideration. I will remove it and add it back when it's
really needed.

>
>> +    } u;
>> +};
>> +
>> +- XEN_DOMCTL_create_viommu
>> +    Create a vIOMMU device with the given viommu_type, capabilities and
>> +MMIO base address. The hypervisor allocates a viommu_id for the new
>> +vIOMMU instance and returns it. The vIOMMU device model in the hypervisor
>> +should check whether it can support the input capabilities and return an
>> +error if not.
>> +
>> +- XEN_DOMCTL_destroy_viommu
>> +    Destroy the vIOMMU in the Xen hypervisor identified by viommu_id.
>> +
>> +These vIOMMU domctls and the vIOMMU option in the configuration file are
>> +designed with multi-vIOMMU support for a single VM in mind (e.g. the
>> +parameters of create/destroy vIOMMU include a vIOMMU id), but the current
>> +implementation only supports one vIOMMU per VM.
>> +
>> +Xen hypervisor vIOMMU command
>> +=============================
>> +Introduce the command line option "viommu=1" to enable the vIOMMU
>> +function in the hypervisor. It is disabled by default.
>
> Hm, I'm not sure we really need this. At the end viommu will be
> disabled by default for guests, unless explicitly enabled in the
> config file.

This is according to Jan's early comments on the RFC patch
https://patchwork.kernel.org/patch/9733869/:

"It's actually a question whether in our current scheme a Kconfig
option is appropriate here in the first place. I'd rather see this be
an always built feature which needs enabling on the command line for
the time being."

>
> Thanks, Roger.
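P.S.: For illustration, here is a minimal toolstack-side sketch of how
the create operation described above could be driven. This is my own
sketch, not code from this series: do_viommu_domctl() is a hypothetical
stand-in for the usual libxc do_domctl() plumbing, while the structure
and constant names are taken from the proposal.

/*
 * Illustrative toolstack-side sketch, not code from this series.
 * Fill in xen_domctl_viommu_op to create a vIOMMU with interrupt
 * remapping for a given domain.
 */
#include <stdint.h>
#include <string.h>

/* Declarations as proposed in this patch (normally from domctl.h): */
#define VIOMMU_TYPE_INTEL_VTD       0
#define VIOMMU_CAP_IRQ_REMAPPING    (1u << 0)
#define XEN_DOMCTL_create_viommu    0

struct xen_domctl_viommu_op {
    uint32_t cmd;
    union {
        struct {
            uint64_t viommu_type;       /* IN  */
            uint64_t base_address;      /* IN  */
            uint64_t capabilities;      /* IN  */
            uint32_t viommu_id;         /* OUT */
        } create_viommu;
        struct {
            uint32_t viommu_id;         /* IN  */
        } destroy_viommu;
    } u;
};

/* Hypothetical wrapper that issues the domctl for a given domain. */
int do_viommu_domctl(uint32_t domid, struct xen_domctl_viommu_op *op);

static int create_vtd_viommu(uint32_t domid, uint64_t base, uint32_t *id)
{
    struct xen_domctl_viommu_op op;
    int rc;

    memset(&op, 0, sizeof(op));
    op.cmd = XEN_DOMCTL_create_viommu;
    op.u.create_viommu.viommu_type  = VIOMMU_TYPE_INTEL_VTD;
    op.u.create_viommu.base_address = base;  /* guest MMIO base */
    op.u.create_viommu.capabilities = VIOMMU_CAP_IRQ_REMAPPING;

    rc = do_viommu_domctl(domid, &op);
    if ( rc == 0 )
        *id = op.u.create_viommu.viommu_id;  /* allocated by Xen */

    return rc;
}

If Xen can support the requested capabilities it writes the allocated
viommu_id back, which the toolstack would keep around, e.g. to pass to
a later destroy operation if that is kept.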
--
Best regards
Tianyu Lan