[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] vpci/msix: handle accesses adjacent to the MSI-X table
On Tue, Mar 14, 2023 at 12:56:33PM +0100, Jan Beulich wrote: > On 14.03.2023 11:13, Roger Pau Monne wrote: > > --- a/xen/drivers/vpci/msix.c > > +++ b/xen/drivers/vpci/msix.c > > @@ -27,6 +27,13 @@ > > ((addr) >= vmsix_table_addr(vpci, nr) && \ > > (addr) < vmsix_table_addr(vpci, nr) + vmsix_table_size(vpci, nr)) > > > > +#define VMSIX_ADDR_ADJACENT(addr, vpci, nr) \ > > + ((PFN_DOWN(addr) == PFN_DOWN(vmsix_table_addr(vpci, nr)) && \ > > + (addr) < vmsix_table_addr(vpci, nr)) || \ > > + (PFN_DOWN(addr) == PFN_DOWN(vmsix_table_addr(vpci, nr) + \ > > + vmsix_table_size(vpci, nr) - 1) && \ > > + (addr) >= vmsix_table_addr(vpci, nr) + vmsix_table_size(vpci, nr))) > > While I realize this may impact performance a little, did you consider > using !VMSIX_ADDR_IN_RANGE() instead of open-coding it kind of? Then > again there's only a single use of the macro, and that's in code where > VMSIX_ADDR_IN_RANGE() was already checked (by the earlier invocation > of msix_find()), so the re-checking of the MSI-X table bounds isn't > strictly necessary anyway. I didn't want to rely on the order of execution of the MMIO handlers, so I would rather make sure that the newly added one would work correctly if it turns to be checked for before the MSIX table handling one. I could indeed use !VMSIX_ADDR_IN_RANGE() if that is clearer. > > @@ -438,6 +369,145 @@ static const struct hvm_mmio_ops vpci_msix_table_ops > > = { > > .write = msix_write, > > }; > > > > +const static struct vpci_msix *adjacent_find(const struct domain *d, > > + unsigned long addr) > > +{ > > + const struct vpci_msix *msix; > > + > > + list_for_each_entry ( msix, &d->arch.hvm.msix_tables, next ) > > + /* > > + * So far vPCI only traps accesses to the MSIX table, but not the > > PBA > > + * explicitly, and hence we only need to check for the hole > > created by > > + * the MSIX table. > > + * > > + * If the PBA table is also trapped, the check here should be > > expanded > > + * to take it into account. > > + */ > > + if ( VMSIX_ADDR_ADJACENT(addr, msix->pdev->vpci, VPCI_MSIX_TABLE) ) > > + return msix; > > Is the comment really correct when considering that you don't change > vpci_make_msix_hole()? Urg, I was really confused then, as I didn't remember (and didn't check) that we also punch a hole for the PBA. That's not really needed for dom0, as we allow both reads and writes on that case anyway. > (I was actually puzzled by struct vpci_msix'es > table[] field remaining a 2-element array, despite the PBA now being > dealt with differently. But I realize you need to keep that for the > VMSIX_ADDR_IN_RANGE() in adjacent_write().) If we need to handle the PBA I would need to take it into account for the array handling, since the PBA can be in a different set of page(s) than the MSIX table. > > +static int cf_check adjacent_read( > > + struct vcpu *v, unsigned long addr, unsigned int len, unsigned long > > *data) > > +{ > > + const struct domain *d = v->domain; > > + const struct vpci_msix *msix = adjacent_find(d, addr); > > + const void __iomem *mem; > > + paddr_t msix_tbl; > > + struct vpci *vpci; > > + > > + *data = ~0ul; > > + > > + if ( !msix ) > > + return X86EMUL_RETRY; > > + > > + vpci = msix->pdev->vpci; > > + msix_tbl = vmsix_table_addr(vpci, VPCI_MSIX_TABLE); > > + > > + if ( addr + len > round_pgup(msix_tbl + > > + vmsix_table_size(vpci, VPCI_MSIX_TABLE)) ) > > + return X86EMUL_OKAY; > > + > > + mem = get_table(vpci, > > + PFN_DOWN(addr) == PFN_DOWN(msix_tbl) ? > > VPCI_MSIX_TBL_HEAD > > + : > > VPCI_MSIX_TBL_TAIL); > > + if ( !mem ) > > + return X86EMUL_OKAY; > > The respective PBA logic had a gprintk() on this path. Ack, will add one. > > + switch ( len ) > > + { > > + case 1: > > + *data = readb(mem + PAGE_OFFSET(addr)); > > + break; > > + > > + case 2: > > + *data = readw(mem + PAGE_OFFSET(addr)); > > + break; > > + > > + case 4: > > + *data = readl(mem + PAGE_OFFSET(addr)); > > + break; > > + > > + case 8: > > + *data = readq(mem + PAGE_OFFSET(addr)); > > + break; > > So far we have allowed only aligned 4- and 8-byte accesses to the PBA. > Shouldn't we continue to do so? I've read the spec about this and wasn't able to find a reference about having to access the PBA using 4 and 8 byte accesses. There's one for the MSI-X table, but it's my understanding it only applies to the table. > I'm also concerned of misaligned accesses: While we can't keep the > guest from doing such on pages we don't intercept, depending on the kind > of anomalies such may cause the effects there may be contained to that > guest. When doing the accesses from the hypervisor, bad effects could > affect the entire system. (FTAOD I don't mean to constrain guests, but I > do think we need to consider splitting misaligned accesses.) I was also wondering about misaligned accesses. Should be allow dom0 any kind of access, while limiting domUs to aligned only? > > > + default: > > + ASSERT_UNREACHABLE(); > > Is this correct? In msix_{read,write}() these assertions are valid > because of the earlier access_allowed() checks, but here you have > nothing like that. Yes, the emulator currently would only pass sizes > that fit what is being handled, but relying on no "unusual" insns > appearing down the road feels risky. Then again > hvmemul_phys_mmio_access() splits accesses accordingly, so perhaps > all is fine here. It's only on debug builds, so on production ones the access would just be silently dropped without any ill effect on the hypervisor state. > > +static int cf_check adjacent_write( > > + struct vcpu *v, unsigned long addr, unsigned int len, unsigned long > > data) > > +{ > > + const struct domain *d = v->domain; > > + const struct vpci_msix *msix = adjacent_find(d, addr); > > + void __iomem *mem; > > + paddr_t msix_tbl; > > + struct vpci *vpci; > > + > > + if ( !msix ) > > + return X86EMUL_RETRY; > > + > > + vpci = msix->pdev->vpci; > > + msix_tbl = vmsix_table_addr(vpci, VPCI_MSIX_TABLE); > > + > > + if ( addr + len > round_pgup(msix_tbl + > > + vmsix_table_size(vpci, VPCI_MSIX_TABLE)) ) > > + return X86EMUL_OKAY; > > + > > + if ( (VMSIX_ADDR_IN_RANGE(addr, vpci, VPCI_MSIX_PBA) || > > + VMSIX_ADDR_IN_RANGE(addr + len - 1, vpci, VPCI_MSIX_PBA)) && > > + !is_hardware_domain(d) ) > > + /* Ignore writes to PBA for DomUs, it's undefined behavior. */ > > + return X86EMUL_OKAY; > > Just as a remark: Checking only start and end is sufficient merely because > the PBA is a multiple of 8 bytes in size, and "len" currently cannot be > larger than 8. This feels somewhat fragile, but is - like above - presumably > okay. Indeed, I was relying on len <= min PBA size. I can add a comment to clarify this. > > @@ -530,7 +600,10 @@ static int cf_check init_msix(struct pci_dev *pdev) > > } > > > > if ( list_empty(&d->arch.hvm.msix_tables) ) > > + { > > register_mmio_handler(d, &vpci_msix_table_ops); > > + register_mmio_handler(d, &vpci_msix_adj_ops); > > + } > > Did you consider re-using the same ops by widening what their accept() > covers, and by having read/write recognize inside vs outside accesses, > dealing with them differently (much like the PBA was dealt with before)? I had the feeling it would be clearer to have the MSIX handler only deal with the MSIX table accesses (and so have it's scope properly limited in the accept hook), and deal with the fallout from having to poke a 4K hole in the physmap using a different handler. I will play a bit with both options and see which one I prefer. > Besides my gut feeling of this ending up being less code, there's also > the aspect of NR_IO_HANDLERS being the upper bound to how many handlers > may be registered. > > Or else did you consider registering this further handler only when > there actually is a device where the MSI-X table has "slack" at the > front and/or end? Hm, no, I didn't consider this conditional registering. Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |