Re: [PATCH v2 2/3] x86/hvm: Allow writes to registers on the same page as MSI-X table
On Sat, Mar 25, 2023 at 03:49:23AM +0100, Marek Marczykowski-Górecki wrote:
> Some devices (notably Intel Wifi 6 AX210 card) keep auxiliary registers
> on the same page as MSI-X table. Device model (especially one in
> stubdomain) cannot really handle those, as direct writes to that page is
> refused (page is on the mmio_ro_ranges list). Instead, add internal ioreq
> server that handle those writes.
>
> Doing this, requires correlating write location with guest view
> of MSI-X table address. Since QEMU doesn't map MSI-X table to the guest,
> it requires msixtbl_entry->gtable, which is HVM-only. Similar feature
> for PV would need to be done separately.
>
> This can be also used to read Pending Bit Array, if it lives on the same
> page, making QEMU not needing /dev/mem access at all (especially helpful
> with lockdown enabled in dom0). If PBA lives on another page, QEMU will
> map it to the guest directly.
> If PBA lives on the same page, forbid writes and log a message.
> Technically, writes outside of PBA could be allowed, but at this moment
> the precise location of PBA isn't saved.
>
> Signed-off-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> ---
> v2:
>  - adjust commit message
>  - pass struct domain to msixtbl_page_handler_get_hwaddr()
>  - reduce local variables used only once
>  - log a warning if write is forbidden if MSI-X and PBA lives on the same
>    page
>  - do not passthrough unaligned accesses
>  - handle accesses both before and after MSI-X table
> ---
>  xen/arch/x86/hvm/vmsi.c        | 154 +++++++++++++++++++++++++++++++++
>  xen/arch/x86/include/asm/msi.h |   5 ++
>  xen/arch/x86/msi.c             |  38 ++++++++
>  3 files changed, 197 insertions(+)
>
> diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
> index 9c82bf9b4ec2..9293009a4075 100644
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -438,6 +438,152 @@ static const struct hvm_io_ops msixtbl_mmio_ops = {
>      .write = _msixtbl_write,
>  };
>
> +static void __iomem *msixtbl_page_handler_get_hwaddr(
> +    const struct domain *d,
> +    uint64_t address,
> +    bool write)
> +{
> +    const struct pci_dev *pdev = NULL;
> +    const struct msixtbl_entry *entry;
> +    int adj_type;

unsigned AFAICT.

> +
> +    rcu_read_lock(&msixtbl_rcu_lock);
> +    /*
> +     * Check if it's on the same page as the end of the MSI-X table, but
> +     * outside of the table itself.
> +     */
> +    list_for_each_entry( entry, &d->arch.hvm.msixtbl_list, list )
> +    {
> +        if ( PFN_DOWN(address) == PFN_DOWN(entry->gtable) &&
> +             address < entry->gtable )
> +        {
> +            adj_type = ADJ_IDX_FIRST;
> +            pdev = entry->pdev;
> +            break;
> +        }
> +        if ( PFN_DOWN(address) == PFN_DOWN(entry->gtable + entry->table_len) &&

Should be entry->gtable + entry->table_len - 1, or else you are
off-by-one.
> +             address >= entry->gtable + entry->table_len )
> +        {
> +            adj_type = ADJ_IDX_LAST;
> +            pdev = entry->pdev;
> +            break;
> +        }
> +    }
> +    rcu_read_unlock(&msixtbl_rcu_lock);
> +
> +    if ( !pdev )
> +        return NULL;
> +
> +    ASSERT(pdev->msix);
> +
> +    if ( !pdev->msix->adj_access_table_idx[adj_type] )
> +    {
> +        gdprintk(XENLOG_WARNING,
> +                 "Page for adjacent MSI-X table access not initialized for %pp\n",
> +                 pdev);
> +
> +        return NULL;
> +    }
> +
> +    /* If PBA lives on the same page too, forbid writes. */
> +    if ( write && (
> +         (adj_type == ADJ_IDX_LAST &&
> +          pdev->msix->table.last == pdev->msix->pba.first) ||
> +         (adj_type == ADJ_IDX_FIRST &&
> +          pdev->msix->table.first == pdev->msix->pba.last)) )
> +    {
> +        gdprintk(XENLOG_WARNING,
> +                 "MSI-X table and PBA of %pp live on the same page, "
> +                 "writing to other registers there is not implemented\n",
> +                 pdev);

Don't you actually need to check that the passed address falls into the
PBA array?  PBA can be sharing the same page as the MSI-X table, but
that doesn't imply there aren't holes also not used by either the PBA
or the MSI-X table in such page.

> +        return NULL;
> +    }
> +
> +    return fix_to_virt(pdev->msix->adj_access_table_idx[adj_type]) +
> +           (address & (PAGE_SIZE - 1));

You can use PAGE_OFFSET().

> +
> +}
> +
> +static bool cf_check msixtbl_page_accept(
> +    const struct hvm_io_handler *handler, const ioreq_t *r)
> +{
> +    ASSERT(r->type == IOREQ_TYPE_COPY);
> +
> +    return msixtbl_page_handler_get_hwaddr(
> +        current->domain, r->addr, r->dir == IOREQ_WRITE);

I think you want to accept it also if it's a write to the PBA, and
just drop it.  You should always pass write=false and then drop it in
msixtbl_page_write() if it falls in the PBA region (but still return
X86EMUL_OKAY).

> +}
> +
> +static int cf_check msixtbl_page_read(
> +    const struct hvm_io_handler *handler,
> +    uint64_t address, uint32_t len, uint64_t *pval)

Why use hvm_io_ops and not hvm_mmio_ops?  You only care about MMIO
accesses here.
> +{
> +    void __iomem *hwaddr;

const

I would also initialize *pval = ~0ul for safety.

> +
> +    if ( address & (len - 1) )
> +        return X86EMUL_UNHANDLEABLE;

You can use IS_ALIGNED for clarity.

In my fix for this for vPCI Jan asked to split unaligned accesses into
1byte ones, but I think for guests it's better to just drop them unless
there's a specific case that we need to handle.

Also you should return X86EMUL_OKAY here, as the address is handled
here, but the current way to handle it is to drop the access.  Printing
a debug message can be helpful in case unaligned accesses need to be
implemented in order to support some device.

> +
> +    hwaddr = msixtbl_page_handler_get_hwaddr(
> +        current->domain, address, false);
> +
> +    if ( !hwaddr )
> +        return X86EMUL_UNHANDLEABLE;

Maybe X86EMUL_RETRY, since the page was there in the accept handler.

> +
> +    switch ( len )
> +    {
> +    case 1:
> +        *pval = readb(hwaddr);
> +        break;
> +    case 2:
> +        *pval = readw(hwaddr);
> +        break;
> +    case 4:
> +        *pval = readl(hwaddr);
> +        break;
> +    case 8:
> +        *pval = readq(hwaddr);
> +        break;

Nit: we usually add a newline after every break;

> +    default:
> +        return X86EMUL_UNHANDLEABLE;

I would rather use ASSERT_UNREACHABLE() here and fallthrough to the
return below.  Seeing such size is likely an indication of something
else gone very wrong, better to just terminate the access at once.

> +    }
> +    return X86EMUL_OKAY;
> +}
> +
> +static int cf_check msixtbl_page_write(
> +    const struct hvm_io_handler *handler,
> +    uint64_t address, uint32_t len, uint64_t val)
> +{
> +    void __iomem *hwaddr = msixtbl_page_handler_get_hwaddr(
> +        current->domain, address, true);
> +

You don't seem to check whether the access is aligned here?

Otherwise I have mostly the same comments as in msixtbl_page_read().
> +    if ( !hwaddr )
> +        return X86EMUL_UNHANDLEABLE;
> +
> +    switch ( len ) {
> +    case 1:
> +        writeb(val, hwaddr);
> +        break;
> +    case 2:
> +        writew(val, hwaddr);
> +        break;
> +    case 4:
> +        writel(val, hwaddr);
> +        break;
> +    case 8:
> +        writeq(val, hwaddr);
> +        break;
> +    default:
> +        return X86EMUL_UNHANDLEABLE;
> +    }
> +    return X86EMUL_OKAY;
> +
> +}
> +
> +static const struct hvm_io_ops msixtbl_mmio_page_ops = {
> +    .accept = msixtbl_page_accept,
> +    .read = msixtbl_page_read,
> +    .write = msixtbl_page_write,
> +};
> +
>  static void add_msixtbl_entry(struct domain *d,
>                                struct pci_dev *pdev,
>                                uint64_t gtable,
> @@ -593,6 +739,14 @@ void msixtbl_init(struct domain *d)
>          handler->type = IOREQ_TYPE_COPY;
>          handler->ops = &msixtbl_mmio_ops;
>      }
> +
> +    /* passthrough access to other registers on the same page */
> +    handler = hvm_next_io_handler(d);
> +    if ( handler )
> +    {
> +        handler->type = IOREQ_TYPE_COPY;
> +        handler->ops = &msixtbl_mmio_page_ops;
> +    }
>  }
>
>  void msixtbl_pt_cleanup(struct domain *d)
> diff --git a/xen/arch/x86/include/asm/msi.h b/xen/arch/x86/include/asm/msi.h
> index a53ade95c9ad..d13cf1c1f873 100644
> --- a/xen/arch/x86/include/asm/msi.h
> +++ b/xen/arch/x86/include/asm/msi.h
> @@ -207,6 +207,10 @@ struct msg_address {
>                                         PCI_MSIX_ENTRY_SIZE + \
>                                         (~PCI_MSIX_BIRMASK & (PAGE_SIZE - 1)))
>
> +/* indexes in adj_access_table_idx[] below */
> +#define ADJ_IDX_FIRST 0
> +#define ADJ_IDX_LAST  1
> +
>  struct arch_msix {
>      unsigned int nr_entries, used_entries;
>      struct {
> @@ -214,6 +218,7 @@ struct arch_msix {
>      } table, pba;
>      int table_refcnt[MAX_MSIX_TABLE_PAGES];
>      int table_idx[MAX_MSIX_TABLE_PAGES];
> +    int adj_access_table_idx[2];
>      spinlock_t table_lock;
>      bool host_maskall, guest_maskall;
>      domid_t warned;
> diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
> index d0bf63df1def..680853f84685 100644
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -961,6 +961,34 @@ static int msix_capability_init(struct pci_dev *dev,
>              domain_crash(d);
>              /* XXX How to deal with existing mappings? */
>          }
> +
> +        /*
> +         * If the MSI-X table doesn't start at the page boundary, map the first page for
> +         * passthrough accesses.
> +         */

I think you should initialize
msix->adj_access_table_idx[ADJ_IDX_{FIRST,LAST}] to -1?

> +        if ( table_paddr & (PAGE_SIZE - 1) )
> +        {
> +            int idx = msix_get_fixmap(msix, table_paddr, table_paddr);
> +
> +            if ( idx >= 0 )
> +                msix->adj_access_table_idx[ADJ_IDX_FIRST] = idx;
> +            else
> +                gprintk(XENLOG_ERR, "Failed to map first MSI-X table page: %d\n", idx);
> +        }
> +        /*
> +         * If the MSI-X table doesn't span full page(s), map the last page for
> +         * passthrough accesses.
> +         */
> +        if ( (table_paddr + msix->nr_entries * PCI_MSIX_ENTRY_SIZE) & (PAGE_SIZE - 1) )
> +        {
> +            uint64_t entry_paddr = table_paddr + msix->nr_entries * PCI_MSIX_ENTRY_SIZE;
> +            int idx = msix_get_fixmap(msix, table_paddr, entry_paddr);
> +
> +            if ( idx >= 0 )
> +                msix->adj_access_table_idx[ADJ_IDX_LAST] = idx;
> +            else
> +                gprintk(XENLOG_ERR, "Failed to map last MSI-X table page: %d\n", idx);
> +        }
>      }
>      WARN_ON(msix->table.first != (table_paddr >> PAGE_SHIFT));
>      ++msix->used_entries;
> @@ -1090,6 +1118,16 @@ static void _pci_cleanup_msix(struct arch_msix *msix)
>          WARN();
>          msix->table.first = 0;
>          msix->table.last = 0;
> +        if ( msix->adj_access_table_idx[ADJ_IDX_FIRST] )
> +        {
> +            msix_put_fixmap(msix, msix->adj_access_table_idx[ADJ_IDX_FIRST]);
> +            msix->adj_access_table_idx[ADJ_IDX_FIRST] = 0;
> +        }
> +        if ( msix->adj_access_table_idx[ADJ_IDX_LAST] )
> +        {
> +            msix_put_fixmap(msix, msix->adj_access_table_idx[ADJ_IDX_LAST]);
> +            msix->adj_access_table_idx[ADJ_IDX_LAST] = 0;

Isn't 0 a valid idx?  You should check for >= 0 and also set to -1 once
the fixmap entry has been put.

Thanks, Roger.