Xen project Mailing List

Re: [PATCH] vpci/msix: handle accesses adjacent to the MSI-X table

To: Roger Pau Monne <roger.pau@xxxxxxxxxx>

Date: Tue, 14 Mar 2023 12:56:33 +0100

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=CycMpsj2bFwh2XWHtApbCxJag+E25ba/wqRdhNUKzPs=; b=eyT6c8+x9PRQMwa2kCyDaaYBEdjIRVJljyu4w59fBx38yWAFu80tLG5qOfZ068DzUM7im0u7QbO34LcWapRkqyaNzSsZkvu+yIOoxHCK1DlJyOg5DkaNaPyFGC5JFl5OJJiUNwVkSaowjhGbEaumpiXhn+FT244GOO3vH1WI0dQ69bHAbMKDPTWQKLAh2dXeq+wNXcQtS/RXlD2i7kreozL1XjyHRVC2nCTtmiVDIlIfVg/B/RIxOAnGrPUqZLRJSF0FWhZLWJ/57lyNzs5WjwtUtcIocgYQ+XQvVuaC3DYbeoR1QdsigVdwfJCWAC9QcvfvgaTkoLYQpEUqKWeLaQ==

Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=JmYy4nmZp0JBcl/uTtbQzy04GRPucTEOfkuDBtVYGXGSOy6vGqTb+gnMUkuAMEoA6XMRnUGAjDNdEUxxjcdkZLO1DCUEZf9ESZcllr3Q1OE65SeC95aNnEbDzQmiK9hgrGb/TVipFhR9DNblK+/w5vlgBeJzBCyc0CainmDr6TV+rscseTR5Qrcb2ADv4dh4a3od2FBv2qqoWQOB3hLzE2eRSbXJjgIiwmx5EgnD4KzBmFhQy/edF6AdjL1gd45QwPJXs9gJX5narh9ZJUGw88EsugU2WAKjsGefDtzLjaUeGBw5oWxA0jsbqfzCXcnunoWEUK3ev4P4Ekd2Pt5P4g==

Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;

Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx

Delivery-date: Tue, 14 Mar 2023 11:56:57 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 14.03.2023 11:13, Roger Pau Monne wrote: > --- a/xen/drivers/vpci/msix.c > +++ b/xen/drivers/vpci/msix.c > @@ -27,6 +27,13 @@ > ((addr) >= vmsix_table_addr(vpci, nr) && \ > (addr) < vmsix_table_addr(vpci, nr) + vmsix_table_size(vpci, nr)) > > +#define VMSIX_ADDR_ADJACENT(addr, vpci, nr) \ > + ((PFN_DOWN(addr) == PFN_DOWN(vmsix_table_addr(vpci, nr)) && \ > + (addr) < vmsix_table_addr(vpci, nr)) || \ > + (PFN_DOWN(addr) == PFN_DOWN(vmsix_table_addr(vpci, nr) + \ > + vmsix_table_size(vpci, nr) - 1) && \ > + (addr) >= vmsix_table_addr(vpci, nr) + vmsix_table_size(vpci, nr))) While I realize this may impact performance a little, did you consider using !VMSIX_ADDR_IN_RANGE() instead of open-coding it kind of? Then again there's only a single use of the macro, and that's in code where VMSIX_ADDR_IN_RANGE() was already checked (by the earlier invocation of msix_find()), so the re-checking of the MSI-X table bounds isn't strictly necessary anyway. > @@ -438,6 +369,145 @@ static const struct hvm_mmio_ops vpci_msix_table_ops = { > .write = msix_write, > }; > > +const static struct vpci_msix *adjacent_find(const struct domain *d, > + unsigned long addr) > +{ > + const struct vpci_msix *msix; > + > + list_for_each_entry ( msix, &d->arch.hvm.msix_tables, next ) > + /* > + * So far vPCI only traps accesses to the MSIX table, but not the PBA > + * explicitly, and hence we only need to check for the hole created > by > + * the MSIX table. > + * > + * If the PBA table is also trapped, the check here should be > expanded > + * to take it into account. > + */ > + if ( VMSIX_ADDR_ADJACENT(addr, msix->pdev->vpci, VPCI_MSIX_TABLE) ) > + return msix; Is the comment really correct when considering that you don't change vpci_make_msix_hole()? (I was actually puzzled by struct vpci_msix'es table[] field remaining a 2-element array, despite the PBA now being dealt with differently. But I realize you need to keep that for the VMSIX_ADDR_IN_RANGE() in adjacent_write().) > +static int cf_check adjacent_read( > + struct vcpu *v, unsigned long addr, unsigned int len, unsigned long > *data) > +{ > + const struct domain *d = v->domain; > + const struct vpci_msix *msix = adjacent_find(d, addr); > + const void __iomem *mem; > + paddr_t msix_tbl; > + struct vpci *vpci; > + > + *data = ~0ul; > + > + if ( !msix ) > + return X86EMUL_RETRY; > + > + vpci = msix->pdev->vpci; > + msix_tbl = vmsix_table_addr(vpci, VPCI_MSIX_TABLE); > + > + if ( addr + len > round_pgup(msix_tbl + > + vmsix_table_size(vpci, VPCI_MSIX_TABLE)) ) > + return X86EMUL_OKAY; > + > + mem = get_table(vpci, > + PFN_DOWN(addr) == PFN_DOWN(msix_tbl) ? VPCI_MSIX_TBL_HEAD > + : > VPCI_MSIX_TBL_TAIL); > + if ( !mem ) > + return X86EMUL_OKAY; The respective PBA logic had a gprintk() on this path. > + switch ( len ) > + { > + case 1: > + *data = readb(mem + PAGE_OFFSET(addr)); > + break; > + > + case 2: > + *data = readw(mem + PAGE_OFFSET(addr)); > + break; > + > + case 4: > + *data = readl(mem + PAGE_OFFSET(addr)); > + break; > + > + case 8: > + *data = readq(mem + PAGE_OFFSET(addr)); > + break; So far we have allowed only aligned 4- and 8-byte accesses to the PBA. Shouldn't we continue to do so? I'm also concerned of misaligned accesses: While we can't keep the guest from doing such on pages we don't intercept, depending on the kind of anomalies such may cause the effects there may be contained to that guest. When doing the accesses from the hypervisor, bad effects could affect the entire system. (FTAOD I don't mean to constrain guests, but I do think we need to consider splitting misaligned accesses.) > + default: > + ASSERT_UNREACHABLE(); Is this correct? In msix_{read,write}() these assertions are valid because of the earlier access_allowed() checks, but here you have nothing like that. Yes, the emulator currently would only pass sizes that fit what is being handled, but relying on no "unusual" insns appearing down the road feels risky. Then again hvmemul_phys_mmio_access() splits accesses accordingly, so perhaps all is fine here. > +static int cf_check adjacent_write( > + struct vcpu *v, unsigned long addr, unsigned int len, unsigned long data) > +{ > + const struct domain *d = v->domain; > + const struct vpci_msix *msix = adjacent_find(d, addr); > + void __iomem *mem; > + paddr_t msix_tbl; > + struct vpci *vpci; > + > + if ( !msix ) > + return X86EMUL_RETRY; > + > + vpci = msix->pdev->vpci; > + msix_tbl = vmsix_table_addr(vpci, VPCI_MSIX_TABLE); > + > + if ( addr + len > round_pgup(msix_tbl + > + vmsix_table_size(vpci, VPCI_MSIX_TABLE)) ) > + return X86EMUL_OKAY; > + > + if ( (VMSIX_ADDR_IN_RANGE(addr, vpci, VPCI_MSIX_PBA) || > + VMSIX_ADDR_IN_RANGE(addr + len - 1, vpci, VPCI_MSIX_PBA)) && > + !is_hardware_domain(d) ) > + /* Ignore writes to PBA for DomUs, it's undefined behavior. */ > + return X86EMUL_OKAY; Just as a remark: Checking only start and end is sufficient merely because the PBA is a multiple of 8 bytes in size, and "len" currently cannot be larger than 8. This feels somewhat fragile, but is - like above - presumably okay. > @@ -530,7 +600,10 @@ static int cf_check init_msix(struct pci_dev *pdev) > } > > if ( list_empty(&d->arch.hvm.msix_tables) ) > + { > register_mmio_handler(d, &vpci_msix_table_ops); > + register_mmio_handler(d, &vpci_msix_adj_ops); > + } Did you consider re-using the same ops by widening what their accept() covers, and by having read/write recognize inside vs outside accesses, dealing with them differently (much like the PBA was dealt with before)? Besides my gut feeling of this ending up being less code, there's also the aspect of NR_IO_HANDLERS being the upper bound to how many handlers may be registered. Or else did you consider registering this further handler only when there actually is a device where the MSI-X table has "slack" at the front and/or end? Jan

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.