[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [RFC XEN PATCH v9 5/5] domctl: Add XEN_DOMCTL_gsi_permission to grant gsi
Hi Jan, On 2024/6/11 22:39, Jan Beulich wrote: > On 07.06.2024 10:11, Jiqian Chen wrote: >> Some type of domain don't have PIRQ, like PVH, it do not do >> PHYSDEVOP_map_pirq for each gsi. When passthrough a device >> to guest on PVH dom0, callstack >> pci_add_dm_done->XEN_DOMCTL_irq_permission will failed at >> domain_pirq_to_irq, because PVH has no mapping of gsi, pirq >> and irq on Xen side. > > All of this is, to me at least, in pretty sharp contradiction to what > patch 2 says and does. IOW: Do we want the concept of pIRQ in PVH, or > do we want to keep that to PV? It's not contradictory. What I did is not to add the concept of PIRQs for PVH. All previous passthrough code was implemented on the basis of pv dom0 + hvm domU. For pv dom0, it has PIRQs. For hvm domU, it has PIRQs too. So the codes are not suitable for PVH dom0 + hvm domU, because PVH dom0 has no PIRQs. Patch 2 do PHYSDEVOP_map_pirq for hvm domU even when dom0 is PVH instead of PV. It didn't add PIRQs for PVH. This patch is to grant irq( that get from gsi ) to hvm domU, why XEN_DOMCTL_irq_permission is not useful is because PVH has no PIRQs, we can't get irq through pirq like PV does. > >> What's more, current hypercall XEN_DOMCTL_irq_permission require >> passing in pirq and grant the access of irq, it is not suitable >> for dom0 that has no PIRQ flag, because passthrough a device >> needs gsi and grant the corresponding irq to guest. So, add a >> new hypercall to grant gsi permission when dom0 is not PV or dom0 >> has not PIRQ flag. >> >> Signed-off-by: Huang Rui <ray.huang@xxxxxxx> >> Signed-off-by: Jiqian Chen <Jiqian.Chen@xxxxxxx> > > A problem throughout the series as it seems: Who's the author of these > patches? There's no From: saying it's not you, but your S-o-b also > isn't first. So I need to change to: Signed-off-by: Jiqian Chen <Jiqian.Chen@xxxxxxx> means I am the author. Signed-off-by: Huang Rui <ray.huang@xxxxxxx> means Rui sent them to upstream firstly. Signed-off-by: Jiqian Chen <Jiqian.Chen@xxxxxxx> means I take continue to upstream. > >> --- a/tools/libs/light/libxl_pci.c >> +++ b/tools/libs/light/libxl_pci.c >> @@ -1412,6 +1412,37 @@ static bool pci_supp_legacy_irq(void) >> #define PCI_SBDF(seg, bus, devfn) \ >> ((((uint32_t)(seg)) << 16) | (PCI_DEVID(bus, devfn))) >> >> +static int pci_device_set_gsi(libxl_ctx *ctx, >> + libxl_domid domid, >> + libxl_device_pci *pci, >> + bool map, >> + int *gsi_back) >> +{ >> + int r, gsi, pirq; >> + uint32_t sbdf; >> + >> + sbdf = PCI_SBDF(pci->domain, pci->bus, (PCI_DEVFN(pci->dev, >> pci->func))); >> + r = xc_physdev_gsi_from_dev(ctx->xch, sbdf); >> + *gsi_back = r; >> + if (r < 0) >> + return r; >> + >> + gsi = r; >> + pirq = r; > > r is a GSI as per above; why would you store such in a variable named pirq? > And how can ... > >> + if (map) >> + r = xc_physdev_map_pirq(ctx->xch, domid, gsi, &pirq); >> + else >> + r = xc_physdev_unmap_pirq(ctx->xch, domid, pirq); > > ... that value be the correct one to pass into here? In fact, the pIRQ number > you obtain above in the "map" case isn't handed to the caller, i.e. it is > effectively lost. Yet that's what would need passing into such an unmap call. Yes r is GSI and I know pirq will be replaced by xc_physdev_map_pirq. What I do "pirq = r" is for xc_physdev_unmap_pirq, unmap need passing in pirq, and the number of pirq is always equal to gsi. > >> + if (r) >> + return r; >> + >> + r = xc_domain_gsi_permission(ctx->xch, domid, gsi, map); > > Looking at the hypervisor side, this will fail for PV Dom0. In which case imo > you better would avoid making the call in the first place. Yes, for PV dom0, the errno is EOPNOTSUPP, then it will do below xc_domain_irq_permission. > >> + if (r && errno == EOPNOTSUPP) > > Before here you don't really need the pIRQ number; if all it really is needed > for is ... > >> + r = xc_domain_irq_permission(ctx->xch, domid, pirq, map); > > ... this, then it probably also should only be obtained when it's needed. Yet > overall the intentions here aren't quite clear to me. Adding the function pci_device_set_gsi is for PVH dom0, while also ensuring compatibility with PV dom0. When PVH dom0, it does xc_physdev_map_pirq and xc_domain_gsi_permission(new hypercall for PVH dom0) When PV dom0, it keeps the same actions as before codes, it does xc_physdev_map_pirq and xc_domain_irq_permission. > >> @@ -1485,6 +1516,19 @@ static void pci_add_dm_done(libxl__egc *egc, >> fclose(f); >> if (!pci_supp_legacy_irq()) >> goto out_no_irq; >> + >> + r = pci_device_set_gsi(ctx, domid, pci, 1, &gsi); >> + if (gsi >= 0) { >> + if (r < 0) { > > This unusual way of error checking likely wants a comment. Will add in next version. > >> + rc = ERROR_FAIL; >> + LOGED(ERROR, domainid, >> + "pci_device_set_gsi gsi=%d (error=%d)", gsi, errno); >> + goto out; >> + } else { >> + goto process_permissive; >> + } > > Btw, no need for "else" when the earlier if ends in "goto" or alike. OK, I will change in next version. > >> @@ -1493,13 +1537,6 @@ static void pci_add_dm_done(libxl__egc *egc, >> goto out_no_irq; >> } >> if ((fscanf(f, "%u", &irq) == 1) && irq) { >> - sbdf = PCI_SBDF(pci->domain, pci->bus, >> - (PCI_DEVFN(pci->dev, pci->func))); >> - r = xc_physdev_gsi_from_dev(ctx->xch, sbdf); >> - /* if fail, keep using irq; if success, r is gsi, use gsi */ >> - if (r != -1) { >> - irq = r; >> - } > > If I'm not mistaken, this and ... > >> @@ -2255,13 +2302,6 @@ skip_bar: >> } >> >> if ((fscanf(f, "%u", &irq) == 1) && irq) { >> - sbdf = PCI_SBDF(pci->domain, pci->bus, >> - (PCI_DEVFN(pci->dev, pci->func))); >> - r = xc_physdev_gsi_from_dev(ctx->xch, sbdf); >> - /* if fail, keep using irq; if success, r is gsi, use gsi */ >> - if (r != -1) { >> - irq = r; >> - } > > ... this is code added by the immediately preceding patch. It's pretty odd > for that to be deleted here again right away. Can the interaction of the > two patches perhaps be re-arranged to avoid this anomaly? OK, I will do in next version. > >> @@ -237,6 +238,43 @@ long arch_do_domctl( >> break; >> } >> >> + case XEN_DOMCTL_gsi_permission: >> + { >> + unsigned int gsi = domctl->u.gsi_permission.gsi; >> + int irq = gsi_2_irq(gsi); > > I'm not sure it is a good idea to issue this call ahead of the basic error > checks below. I will move it below the checks. > >> + bool allow = domctl->u.gsi_permission.allow_access; > > This allows any non-zero values to mean "true". I think you want to bail > on values larger than 1, much like you also want to check that the padding > fields are all zero. Will change in next version. > >> + /* >> + * If current domain is PV or it has PIRQ flag, it has a mapping >> + * of gsi, pirq and irq, so it should use XEN_DOMCTL_irq_permission >> + * to grant irq permission. >> + */ >> + if ( is_pv_domain(current->domain) || has_pirq(current->domain) ) > > Please use currd here (and also again below). Will change in next version. > >> + { >> + ret = -EOPNOTSUPP; >> + break; >> + } >> + >> + if ( gsi >= nr_irqs_gsi || irq < 0 ) >> + { >> + ret = -EINVAL; >> + break; >> + } >> + >> + if ( !irq_access_permitted(current->domain, irq) || >> + xsm_irq_permission(XSM_HOOK, d, irq, allow) ) > > Daniel, is it okay to issue the XSM check using the translated value, not > the one that was originally passed into the hypercall? > >> --- a/xen/arch/x86/io_apic.c >> +++ b/xen/arch/x86/io_apic.c >> @@ -955,6 +955,27 @@ static int pin_2_irq(int idx, int apic, int pin) >> return irq; >> } >> >> +int gsi_2_irq(int gsi) >> +{ >> + int entry, ioapic, pin; >> + >> + ioapic = mp_find_ioapic(gsi); >> + if ( ioapic < 0 ) >> + return -1; > > Can this be a proper errno value (likely -EINVAL), please? Will change in next version. > >> + pin = gsi - io_apic_gsi_base(ioapic); > > Hmm, instead of the below: Once you have an (ioapic,pin) tuple, can't > you simply call apic_pin_2_gsi_irq()? Will change in next version. > >> + entry = find_irq_entry(ioapic, pin, mp_INT); >> + /* >> + * If there is no override mapping for irq and gsi in mp_irqs, >> + * then the default identity mapping applies. >> + */ >> + if ( entry < 0 ) >> + return gsi; >> + >> + return pin_2_irq(entry, ioapic, pin); > > Under certain conditions this may return 0. Yet you surely don't want > to pass IRQ0 back as a result; you want to hand an error back instead. Will add in next version. > >> --- a/xen/include/public/domctl.h >> +++ b/xen/include/public/domctl.h >> @@ -465,6 +465,14 @@ struct xen_domctl_irq_permission { >> }; >> >> >> +/* XEN_DOMCTL_gsi_permission */ >> +struct xen_domctl_gsi_permission { >> + uint32_t gsi; >> + uint8_t allow_access; /* flag to specify enable/disable of x86 gsi >> access */ >> + uint8_t pad[3]; >> +}; >> + >> + > > Nit: No (new) double blank lines please. In fact (didn't I say this before > already?) you could insert between the two above, such that the existing issue > also disappears. I remember I changed it here, maybe I made a mistake somewhere, and I will modify it in the next version. > > Jan -- Best regards, Jiqian Chen.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |