Re: [PATCH v2 2/3] x86/hvm: Allow writes to registers on the same page as MSI-X table
On Tue, Mar 28, 2023 at 01:28:44PM +0200, Roger Pau Monné wrote:
> On Sat, Mar 25, 2023 at 03:49:23AM +0100, Marek Marczykowski-Górecki wrote:
> > Some devices (notably the Intel Wifi 6 AX210 card) keep auxiliary
> > registers on the same page as the MSI-X table. The device model
> > (especially one in a stubdomain) cannot really handle those, as direct
> > writes to that page are refused (the page is on the mmio_ro_ranges
> > list). Instead, add an internal ioreq server that handles those writes.
> >
> > Doing this requires correlating the write location with the guest's view
> > of the MSI-X table address. Since QEMU doesn't map the MSI-X table to
> > the guest, it requires msixtbl_entry->gtable, which is HVM-only. A
> > similar feature for PV would need to be done separately.
> >
> > This can also be used to read the Pending Bit Array, if it lives on the
> > same page, so QEMU doesn't need /dev/mem access at all (especially
> > helpful with lockdown enabled in dom0). If the PBA lives on another
> > page, QEMU will map it to the guest directly. If the PBA lives on the
> > same page, forbid writes and log a message. Technically, writes outside
> > of the PBA could be allowed, but at this moment the precise location of
> > the PBA isn't saved.
> >
> > Signed-off-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> > ---
> > v2:
> >  - adjust commit message
> >  - pass struct domain to msixtbl_page_handler_get_hwaddr()
> >  - reduce local variables used only once
> >  - log a warning if write is forbidden if MSI-X and PBA lives on the same
> >    page
> >  - do not passthrough unaligned accesses
> >  - handle accesses both before and after MSI-X table
> > ---
> >  xen/arch/x86/hvm/vmsi.c        | 154 +++++++++++++++++++++++++++++++++
> >  xen/arch/x86/include/asm/msi.h |   5 ++
> >  xen/arch/x86/msi.c             |  38 ++++++++
> >  3 files changed, 197 insertions(+)
> > 
> > diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
> > index 9c82bf9b4ec2..9293009a4075 100644
> > --- a/xen/arch/x86/hvm/vmsi.c
> > +++ b/xen/arch/x86/hvm/vmsi.c
> > @@ -438,6 +438,152 @@ static const struct hvm_io_ops msixtbl_mmio_ops = {
> >      .write = _msixtbl_write,
> >  };
> >  
> > +static void __iomem *msixtbl_page_handler_get_hwaddr(
> > +    const struct domain *d,
> > +    uint64_t address,
> > +    bool write)
> > +{
> > +    const struct pci_dev *pdev = NULL;
> > +    const struct msixtbl_entry *entry;
> > +    int adj_type;

> unsigned AFAICT.

Ok.

> > +
> > +    rcu_read_lock(&msixtbl_rcu_lock);
> > +    /*
> > +     * Check if it's on the same page as the end of the MSI-X table, but
> > +     * outside of the table itself.
> > +     */
> > +    list_for_each_entry( entry, &d->arch.hvm.msixtbl_list, list )
> > +    {
> > +        if ( PFN_DOWN(address) == PFN_DOWN(entry->gtable) &&
> > +             address < entry->gtable )
> > +        {
> > +            adj_type = ADJ_IDX_FIRST;
> > +            pdev = entry->pdev;
> > +            break;
> > +        }
> > +        if ( PFN_DOWN(address) == PFN_DOWN(entry->gtable + entry->table_len) &&

> Should be entry->gtable + entry->table_len - 1, or else you are
> off-by-one.

Ok.
> > +             address >= entry->gtable + entry->table_len )
> > +        {
> > +            adj_type = ADJ_IDX_LAST;
> > +            pdev = entry->pdev;
> > +            break;
> > +        }
> > +    }
> > +    rcu_read_unlock(&msixtbl_rcu_lock);
> > +
> > +    if ( !pdev )
> > +        return NULL;
> > +
> > +    ASSERT(pdev->msix);
> > +
> > +    if ( !pdev->msix->adj_access_table_idx[adj_type] )
> > +    {
> > +        gdprintk(XENLOG_WARNING,
> > +                 "Page for adjacent MSI-X table access not initialized for %pp\n",
> > +                 pdev);
> > +
> > +        return NULL;
> > +    }
> > +
> > +    /* If PBA lives on the same page too, forbid writes. */
> > +    if ( write && (
> > +         (adj_type == ADJ_IDX_LAST &&
> > +          pdev->msix->table.last == pdev->msix->pba.first) ||
> > +         (adj_type == ADJ_IDX_FIRST &&
> > +          pdev->msix->table.first == pdev->msix->pba.last)) )
> > +    {
> > +        gdprintk(XENLOG_WARNING,
> > +                 "MSI-X table and PBA of %pp live on the same page, "
> > +                 "writing to other registers there is not implemented\n",
> > +                 pdev);

> Don't you actually need to check that the passed address falls into the
> PBA array? PBA can be sharing the same page as the MSI-X table, but
> that doesn't imply there aren't holes also not used by either the PBA
> or the MSI-X table in such page.

I don't know the exact location of the PBA, so I'm rejecting writes just in
case they do hit the PBA (see the commit message).

> > +        return NULL;
> > +    }
> > +
> > +    return fix_to_virt(pdev->msix->adj_access_table_idx[adj_type]) +
> > +           (address & (PAGE_SIZE - 1));

> You can use PAGE_OFFSET().

Ok.

> > +
> > +}
> > +
> > +static bool cf_check msixtbl_page_accept(
> > +    const struct hvm_io_handler *handler, const ioreq_t *r)
> > +{
> > +    ASSERT(r->type == IOREQ_TYPE_COPY);
> > +
> > +    return msixtbl_page_handler_get_hwaddr(
> > +        current->domain, r->addr, r->dir == IOREQ_WRITE);

> I think you want to accept it also if it's a write to the PBA, and
> just drop it. You should always pass write=false and then drop it in
> msixtbl_page_write() if it falls in the PBA region (but still return
> X86EMUL_OKAY).
I don't want to interfere with msixtbl_mmio_page_ops, this handler is only
about accesses not hitting actual MSI-X structures.

> > +}
> > +
> > +static int cf_check msixtbl_page_read(
> > +    const struct hvm_io_handler *handler,
> > +    uint64_t address, uint32_t len, uint64_t *pval)

> Why use hvm_io_ops and not hvm_mmio_ops? You only care about MMIO
> accesses here.

I followed how msixtbl_mmio_ops are registered. Should that also be changed
to hvm_mmio_ops?

> > +{
> > +    void __iomem *hwaddr;

> const
>
> I would also initialize *pval = ~0ul for safety.

When returning X86EMUL_OKAY, then I agree.

> > +
> > +    if ( address & (len - 1) )
> > +        return X86EMUL_UNHANDLEABLE;

> You can use IS_ALIGNED for clarity. In my fix for this for vPCI Jan
> asked to split unaligned accesses into 1byte ones, but I think for
> guests it's better to just drop them unless there's a specific case
> that we need to handle.
>
> Also you should return X86EMUL_OKAY here, as the address is handled
> here, but the current way to handle it is to drop the access.
>
> Printing a debug message can be helpful in case unaligned accesses
> need to be implemented in order to support some device.

> > +
> > +    hwaddr = msixtbl_page_handler_get_hwaddr(
> > +        current->domain, address, false);
> > +
> > +    if ( !hwaddr )
> > +        return X86EMUL_UNHANDLEABLE;

> Maybe X86EMUL_RETRY, since the page was there in the accept handler.

How does X86EMUL_RETRY work? Is it trying to find the handler again?

> > +
> > +    switch ( len )
> > +    {
> > +    case 1:
> > +        *pval = readb(hwaddr);
> > +        break;
> > +    case 2:
> > +        *pval = readw(hwaddr);
> > +        break;
> > +    case 4:
> > +        *pval = readl(hwaddr);
> > +        break;
> > +    case 8:
> > +        *pval = readq(hwaddr);
> > +        break;

> Nit: we usually add a newline after every break;

Ok.

> > +    default:
> > +        return X86EMUL_UNHANDLEABLE;

> I would rather use ASSERT_UNREACHABLE() here and fallthrough to the
> return below.
> Seeing such size is likely an indication of something else gone very
> wrong, better to just terminate the access at once.

> > +    }
> > +    return X86EMUL_OKAY;
> > +}
> > +
> > +static int cf_check msixtbl_page_write(
> > +    const struct hvm_io_handler *handler,
> > +    uint64_t address, uint32_t len, uint64_t val)
> > +{
> > +    void __iomem *hwaddr = msixtbl_page_handler_get_hwaddr(
> > +        current->domain, address, true);
> > +

> You don't seem to check whether the access is aligned here?
>
> Otherwise I have mostly the same comments as in msixtbl_page_read().

Ok.

> > +    if ( !hwaddr )
> > +        return X86EMUL_UNHANDLEABLE;
> > +
> > +    switch ( len ) {
> > +    case 1:
> > +        writeb(val, hwaddr);
> > +        break;
> > +    case 2:
> > +        writew(val, hwaddr);
> > +        break;
> > +    case 4:
> > +        writel(val, hwaddr);
> > +        break;
> > +    case 8:
> > +        writeq(val, hwaddr);
> > +        break;
> > +    default:
> > +        return X86EMUL_UNHANDLEABLE;
> > +    }
> > +    return X86EMUL_OKAY;
> > +
> > +}
> > +
> > +static const struct hvm_io_ops msixtbl_mmio_page_ops = {
> > +    .accept = msixtbl_page_accept,
> > +    .read = msixtbl_page_read,
> > +    .write = msixtbl_page_write,
> > +};
> > +
> >  static void add_msixtbl_entry(struct domain *d,
> >                                struct pci_dev *pdev,
> >                                uint64_t gtable,
> > @@ -593,6 +739,14 @@ void msixtbl_init(struct domain *d)
> >          handler->type = IOREQ_TYPE_COPY;
> >          handler->ops = &msixtbl_mmio_ops;
> >      }
> > +
> > +    /* passthrough access to other registers on the same page */
> > +    handler = hvm_next_io_handler(d);
> > +    if ( handler )
> > +    {
> > +        handler->type = IOREQ_TYPE_COPY;
> > +        handler->ops = &msixtbl_mmio_page_ops;
> > +    }
> >  }
> > 
> >  void msixtbl_pt_cleanup(struct domain *d)
> > diff --git a/xen/arch/x86/include/asm/msi.h b/xen/arch/x86/include/asm/msi.h
> > index a53ade95c9ad..d13cf1c1f873 100644
> > --- a/xen/arch/x86/include/asm/msi.h
> > +++ b/xen/arch/x86/include/asm/msi.h
> > @@ -207,6 +207,10 @@ struct msg_address {
> >                                         PCI_MSIX_ENTRY_SIZE + \
> >                                         (~PCI_MSIX_BIRMASK & (PAGE_SIZE - 1)))
> >  
> > +/* indexes in adj_access_table_idx[] below */
> > +#define ADJ_IDX_FIRST 0
> > +#define ADJ_IDX_LAST  1
> > +
> >  struct arch_msix {
> >      unsigned int nr_entries, used_entries;
> >      struct {
> >          unsigned int first, last;
> >      } table, pba;
> >      int table_refcnt[MAX_MSIX_TABLE_PAGES];
> >      int table_idx[MAX_MSIX_TABLE_PAGES];
> > +    int adj_access_table_idx[2];
> >      spinlock_t table_lock;
> >      bool host_maskall, guest_maskall;
> >      domid_t warned;
> > diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
> > index d0bf63df1def..680853f84685 100644
> > --- a/xen/arch/x86/msi.c
> > +++ b/xen/arch/x86/msi.c
> > @@ -961,6 +961,34 @@ static int msix_capability_init(struct pci_dev *dev,
> >              domain_crash(d);
> >              /* XXX How to deal with existing mappings? */
> >          }
> > +
> > +        /*
> > +         * If the MSI-X table doesn't start at the page boundary, map the
> > +         * first page for passthrough accesses.
> > +         */

> I think you should initialize
> msix->adj_access_table_idx[ADJ_IDX_{FIRST,LAST}] to -1?

> > +        if ( table_paddr & (PAGE_SIZE - 1) )
> > +        {
> > +            int idx = msix_get_fixmap(msix, table_paddr, table_paddr);
> > +
> > +            if ( idx >= 0 )
> > +                msix->adj_access_table_idx[ADJ_IDX_FIRST] = idx;
> > +            else
> > +                gprintk(XENLOG_ERR, "Failed to map first MSI-X table page: %d\n", idx);
> > +        }
> > +        /*
> > +         * If the MSI-X table doesn't span full page(s), map the last page
> > +         * for passthrough accesses.
> > +         */
> > +        if ( (table_paddr + msix->nr_entries * PCI_MSIX_ENTRY_SIZE) &
> > +             (PAGE_SIZE - 1) )
> > +        {
> > +            uint64_t entry_paddr = table_paddr + msix->nr_entries *
> > +                                   PCI_MSIX_ENTRY_SIZE;
> > +            int idx = msix_get_fixmap(msix, table_paddr, entry_paddr);
> > +
> > +            if ( idx >= 0 )
> > +                msix->adj_access_table_idx[ADJ_IDX_LAST] = idx;
> > +            else
> > +                gprintk(XENLOG_ERR, "Failed to map last MSI-X table page: %d\n", idx);
> > +        }
> >      }
> >      WARN_ON(msix->table.first != (table_paddr >> PAGE_SHIFT));
> >      ++msix->used_entries;
> > @@ -1090,6 +1118,16 @@ static void _pci_cleanup_msix(struct arch_msix *msix)
> >          WARN();
> >      msix->table.first = 0;
> >      msix->table.last = 0;
> > +    if ( msix->adj_access_table_idx[ADJ_IDX_FIRST] )
> > +    {
> > +        msix_put_fixmap(msix, msix->adj_access_table_idx[ADJ_IDX_FIRST]);
> > +        msix->adj_access_table_idx[ADJ_IDX_FIRST] = 0;
> > +    }
> > +    if ( msix->adj_access_table_idx[ADJ_IDX_LAST] )
> > +    {
> > +        msix_put_fixmap(msix, msix->adj_access_table_idx[ADJ_IDX_LAST]);
> > +        msix->adj_access_table_idx[ADJ_IDX_LAST] = 0;

> Isn't 0 a valid idx? You should check for >= 0 and also set
> to -1 once the fixmap entry has been put.

I rely here on msix using a specific range of fixmap indexes
(FIX_MSIX_TO_RESERV_BASE - FIX_MSIX_TO_RESERV_END), which doesn't start
at 0. For this reason also, I keep unused entries at 0 (no explicit
initialization needed).

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab