Re: [PATCH v2 11/11] xen/arm: Translate virtual PCI bus topology for guests
On 27.09.21 14:31, Jan Beulich wrote:
> On 23.09.2021 14:55, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>
>> There are three originators for the PCI configuration space access:
>> 1. The domain that owns physical host bridge: MMIO handlers are
>> there so we can update vPCI register handlers with the values
>> written by the hardware domain, e.g. physical view of the registers
>> vs guest's view on the configuration space.
>> 2. Guest access to the passed through PCI devices: we need to properly
>> map virtual bus topology to the physical one, e.g. pass the configuration
>> space access to the corresponding physical devices.
>> 3. Emulated host PCI bridge access. It doesn't exist in the physical
>> topology, e.g. it can't be mapped to some physical host bridge.
>> So, all access to the host bridge itself needs to be trapped and
>> emulated.
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>
>> ---
>> New in v2
>> ---
>>  xen/arch/arm/domain.c         |  1 +
>>  xen/arch/arm/vpci.c           | 87 +++++++++++++++++++++++++++++++----
>>  xen/arch/arm/vpci.h           |  3 ++
>>  xen/drivers/passthrough/pci.c | 25 ++++++++++
>>  xen/include/asm-arm/pci.h     |  1 +
>>  xen/include/xen/pci.h         |  1 +
>>  xen/include/xen/sched.h       |  2 +
>>  7 files changed, 111 insertions(+), 9 deletions(-)
>>
>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> index c7b25bc70439..c0ad6ad682d2 100644
>> --- a/xen/arch/arm/domain.c
>> +++ b/xen/arch/arm/domain.c
>> @@ -797,6 +797,7 @@ void arch_domain_destroy(struct domain *d)
>>                          get_order_from_bytes(d->arch.efi_acpi_len));
>>  #endif
>>      domain_io_free(d);
>> +    domain_vpci_free(d);
>>  }
>>
>>  void arch_domain_shutdown(struct domain *d)
>> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
>> index 14947e975d69..012f958960d1 100644
>> --- a/xen/arch/arm/vpci.c
>> +++ b/xen/arch/arm/vpci.c
>> @@ -17,6 +17,14 @@
>>
>>  #define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
>>
>> +struct vpci_mmio_priv {
>> +    /*
>> +     * Set to true if the MMIO handlers were set up for the emulated
>> +     * ECAM host PCI bridge.
>> +     */
>> +    bool is_virt_ecam;
>> +};
>> +
>>  /* Do some sanity checks. */
>>  static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
>>  {
>> @@ -38,6 +46,7 @@ static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>>      pci_sbdf_t sbdf;
>>      unsigned long data = ~0UL;
>>      unsigned int size = 1U << info->dabt.size;
>> +    struct vpci_mmio_priv *priv = (struct vpci_mmio_priv *)p;
>>
>>      sbdf.sbdf = MMCFG_BDF(info->gpa);
>>      reg = REGISTER_OFFSET(info->gpa);
>> @@ -45,6 +54,13 @@ static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>>      if ( !vpci_mmio_access_allowed(reg, size) )
>>          return 0;
>>
>> +    /*
>> +     * For the passed through devices we need to map their virtual SBDF
>> +     * to the physical PCI device being passed through.
>> +     */
>> +    if ( priv->is_virt_ecam && !pci_translate_virtual_device(v, &sbdf) )
>> +        return 1;
>> +
>>      data = vpci_read(sbdf, reg, min(4u, size));
>>      if ( size == 8 )
>>          data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
>> @@ -61,6 +77,7 @@ static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>>      pci_sbdf_t sbdf;
>>      unsigned long data = r;
>>      unsigned int size = 1U << info->dabt.size;
>> +    struct vpci_mmio_priv *priv = (struct vpci_mmio_priv *)p;
>>
>>      sbdf.sbdf = MMCFG_BDF(info->gpa);
>>      reg = REGISTER_OFFSET(info->gpa);
>> @@ -68,6 +85,13 @@ static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>>      if ( !vpci_mmio_access_allowed(reg, size) )
>>          return 0;
>>
>> +    /*
>> +     * For the passed through devices we need to map their virtual SBDF
>> +     * to the physical PCI device being passed through.
>> +     */
>> +    if ( priv->is_virt_ecam && !pci_translate_virtual_device(v, &sbdf) )
>> +        return 1;
>> +
>>      vpci_write(sbdf, reg, min(4u, size), data);
>>      if ( size == 8 )
>>          vpci_write(sbdf, reg + 4, 4, data >> 32);
>> @@ -80,13 +104,48 @@ static const struct mmio_handler_ops vpci_mmio_handler = {
>>      .write = vpci_mmio_write,
>>  };
>>
>> +/*
>> + * There are three originators for the PCI configuration space access:
>> + * 1. The domain that owns physical host bridge: MMIO handlers are
>> + * there so we can update vPCI register handlers with the values
>> + * written by the hardware domain, e.g. physical view of the registers/
>> + * configuration space.
>> + * 2. Guest access to the passed through PCI devices: we need to properly
>> + * map virtual bus topology to the physical one, e.g. pass the configuration
>> + * space access to the corresponding physical devices.
>> + * 3. Emulated host PCI bridge access. It doesn't exist in the physical
>> + * topology, e.g. it can't be mapped to some physical host bridge.
>> + * So, all access to the host bridge itself needs to be trapped and
>> + * emulated.
>> + */
>>  static int vpci_setup_mmio_handler(struct domain *d,
>>                                     struct pci_host_bridge *bridge)
>>  {
>> -    struct pci_config_window *cfg = bridge->cfg;
>> +    struct vpci_mmio_priv *priv;
>> +
>> +    priv = xzalloc(struct vpci_mmio_priv);
>> +    if ( !priv )
>> +        return -ENOMEM;
>> +
>> +    priv->is_virt_ecam = !is_hardware_domain(d);
>>
>> -    register_mmio_handler(d, &vpci_mmio_handler,
>> -                          cfg->phys_addr, cfg->size, NULL);
>> +    if ( is_hardware_domain(d) )
>> +    {
>> +        struct pci_config_window *cfg = bridge->cfg;
>> +
>> +        bridge->mmio_priv = priv;
>> +        register_mmio_handler(d, &vpci_mmio_handler,
>> +                              cfg->phys_addr, cfg->size,
>> +                              priv);
>> +    }
>> +    else
>> +    {
>> +        d->vpci_mmio_priv = priv;
>> +        /* Guest domains use what is programmed in their device tree. */
>> +        register_mmio_handler(d, &vpci_mmio_handler,
>> +                              GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE,
>> +                              priv);
>> +    }
>>      return 0;
>>  }
>>
>> @@ -95,13 +154,16 @@ int domain_vpci_init(struct domain *d)
>>      if ( !has_vpci(d) )
>>          return 0;
>>
>> -    if ( is_hardware_domain(d) )
>> -        return pci_host_iterate_bridges(d, vpci_setup_mmio_handler);
>> -
>> -    /* Guest domains use what is programmed in their device tree. */
>> -    register_mmio_handler(d, &vpci_mmio_handler,
>> -                          GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
>> +    return pci_host_iterate_bridges(d, vpci_setup_mmio_handler);
>> +}
>>
>> +static int domain_vpci_free_cb(struct domain *d,
>> +                               struct pci_host_bridge *bridge)
>> +{
>> +    if ( is_hardware_domain(d) )
>> +        XFREE(bridge->mmio_priv);
>> +    else
>> +        XFREE(d->vpci_mmio_priv);
>>      return 0;
>>  }
>>
>> @@ -124,6 +186,13 @@ int domain_vpci_get_num_mmio_handlers(struct domain *d)
>>      return count;
>>  }
>>
>> +void domain_vpci_free(struct domain *d)
>> +{
>> +    if ( !has_vpci(d) )
>> +        return;
>> +
>> +    pci_host_iterate_bridges(d, domain_vpci_free_cb);
>> +}
>>  /*
>>   * Local variables:
>>   * mode: C
>> diff --git a/xen/arch/arm/vpci.h b/xen/arch/arm/vpci.h
>> index 27a2b069abd2..38e5a28c0d95 100644
>> --- a/xen/arch/arm/vpci.h
>> +++ b/xen/arch/arm/vpci.h
>> @@ -18,6 +18,7 @@
>>  #ifdef CONFIG_HAS_VPCI
>>  int domain_vpci_init(struct domain *d);
>>  int domain_vpci_get_num_mmio_handlers(struct domain *d);
>> +void domain_vpci_free(struct domain *d);
>>  #else
>>  static inline int domain_vpci_init(struct domain *d)
>>  {
>> @@ -28,6 +29,8 @@ static inline int domain_vpci_get_num_mmio_handlers(struct domain *d)
>>  {
>>      return 0;
>>  }
>> +
>> +static inline void domain_vpci_free(struct domain *d) { }
>>  #endif
>>
>>  #endif /* __ARCH_ARM_VPCI_H__ */
>> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
>> index 4552ace855e0..579c6947cc35 100644
>> --- a/xen/drivers/passthrough/pci.c
>> +++ b/xen/drivers/passthrough/pci.c
>> @@ -890,6 +890,31 @@ int pci_remove_virtual_device(struct domain *d, const struct pci_dev *pdev)
>>      return 0;
>>  }
>>
>> +/*
>> + * Find the physical device which is mapped to the virtual device
>> + * and translate virtual SBDF to the physical one.
>> + */
>> +bool pci_translate_virtual_device(struct vcpu *v, pci_sbdf_t *sbdf)

> Why struct vcpu, when you only need ...
>
>> +{
>> +    struct domain *d = v->domain;

> ... this? It's also not really logical for this function to take a
> struct vcpu, as the translation should be uniform within a domain.

Agree, struct domain is just enough.

> Also - const please (as said elsewhere before, ideally wherever possible
> and sensible).

Ok

>> +    struct vpci_dev *vdev;
>> +    bool found = false;
>> +
>> +    pcidevs_lock();
>> +    list_for_each_entry ( vdev, &d->vdev_list, list )
>> +    {
>> +        if ( vdev->sbdf.sbdf == sbdf->sbdf )
>> +        {
>> +            /* Replace virtual SBDF with the physical one. */
>> +            *sbdf = vdev->pdev->sbdf;
>> +            found = true;
>> +            break;
>> +        }
>> +    }

> For a DomU with just one or at most a couple of devices, such a brute
> force lookup may be fine. What about Dom0 though? The physical topology
> gets split at the segment level, so maybe this would be a reasonable
> granularity here as well?

Not sure I am following why topology matters here. We are just trying to
match one SBDF (as seen by the guest) to another SBDF (physical, as seen
by Dom0), so we can proxy DomU's configuration space access to the proper
device in Dom0.

>> +    pcidevs_unlock();
>> +    return found;

> Nit: Blank line please ahead of the main "return" of a function.

Sure

>> +}
>> +
>>  /* Caller should hold the pcidevs_lock */
>>  static int deassign_device(struct domain *d, uint16_t seg, uint8_t bus,
>>                             uint8_t devfn)

> Seeing this function in context (which patch 2 adds without any #ifdef
> around it afaics),

I believe you are talking about vpci_deassign_device here.
vpci_{assign|deassign}_device seem to be not called on x86 PVH as of now,
this is true.

> will this new function needlessly be built on x86 as
> well?

It will at the moment. But in order to avoid ifdefery I would like to
still implement it as an empty function for x86.

> (I didn't look at other intermediate patches yet, so please
> forgive if I've missed the addition of an #ifdef.)

So, I can gate this with HAS_VPCI_GUEST_SUPPORT in patch 2
(HAS_VPCI_GUEST_SUPPORT is introduced in patch 4, so I'll move it to 2).
Does this sound good?

> Jan

Thank you,
Oleksandr