[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v4 2/6] x86/hvm: Allow access to registers on the same page as MSI-X table
On 24.11.2023 02:47, Marek Marczykowski-Górecki wrote: > Some devices (notably Intel Wifi 6 AX210 card) keep auxiliary registers > on the same page as MSI-X table. Device model (especially one in > stubdomain) cannot really handle those, as direct writes to that page is > refused (page is on the mmio_ro_ranges list). Instead, extend > msixtbl_mmio_ops to handle such accesses too. > > Doing this, requires correlating read/write location with guest > of MSI-X table address. Since QEMU doesn't map MSI-X table to the guest, > it requires msixtbl_entry->gtable, which is HVM-only. Similar feature > for PV would need to be done separately. > > This will be also used to read Pending Bit Array, if it lives on the same > page, making QEMU not needing /dev/mem access at all (especially helpful > with lockdown enabled in dom0). If PBA lives on another page, QEMU will > map it to the guest directly. > If PBA lives on the same page, discard writes and log a message. > Technically, writes outside of PBA could be allowed, but at this moment > the precise location of PBA isn't saved, and also no known device abuses > the spec in this way (at least yet). > > To access those registers, msixtbl_mmio_ops need the relevant page > mapped. MSI handling already has infrastructure for that, using fixmap, > so try to map first/last page of the MSI-X table (if necessary) and save > their fixmap indexes. Note that msix_get_fixmap() does reference > counting and reuses existing mapping, so just call it directly, even if > the page was mapped before. Also, it uses a specific range of fixmap > indexes which doesn't include 0, so use 0 as default ("not mapped") > value - which simplifies code a bit. > > GCC gets confused about 'desc' variable: > > arch/x86/hvm/vmsi.c: In function ‘msixtbl_range’: > arch/x86/hvm/vmsi.c:553:8: error: ‘desc’ may be used uninitialized > [-Werror=maybe-uninitialized] > 553 | if ( desc ) > | ^ > arch/x86/hvm/vmsi.c:537:28: note: ‘desc’ was declared here > 537 | const struct msi_desc *desc; > | ^~~~ This could do with also indicating the gcc version. Issues like this tend to get fixed over time. > It's conditional initialization is actually correct (in the case where > it isn't initialized, function returns early), but to avoid > build failure initialize it explicitly to NULL anyway. > > Signed-off-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx> Logic looks okay to me now (albeit I'm still not overly happy that we need to gain such code), but there are a couple of cosmetic issues. > @@ -213,6 +217,131 @@ static struct msi_desc *msixtbl_addr_to_desc( > return NULL; > } > > +/* > + * Returns: > + * - UINT_MAX if no handling should be done > + * - UINT_MAX-1 if write should be discarded > + * - a fixmap idx to use for handling > + */ > +#define ADJACENT_DONT_HANDLE UINT_MAX > +#define ADJACENT_DISCARD_WRITE (UINT_MAX - 1) The comment would imo better talk in terms of the two constants you define here. > +static unsigned int adjacent_handle( > + const struct msixtbl_entry *entry, unsigned long addr, bool write) > +{ > + unsigned int adj_type; > + const struct arch_msix *msix; > + > + if ( !entry || !entry->pdev ) > + return ADJACENT_DONT_HANDLE; > + > + if ( PFN_DOWN(addr) == PFN_DOWN(entry->gtable) && addr < entry->gtable ) > + adj_type = ADJ_IDX_FIRST; > + else if ( PFN_DOWN(addr) == PFN_DOWN(entry->gtable + entry->table_len - > 1) && > + addr >= entry->gtable + entry->table_len ) > + adj_type = ADJ_IDX_LAST; > + else > + return ADJACENT_DONT_HANDLE; > + > + msix = entry->pdev->msix; > + ASSERT(msix); > + > + if ( !msix->adj_access_idx[adj_type] ) > + { > + gprintk(XENLOG_WARNING, > + "Page for adjacent(%d) MSI-X table access not initialized > for %pp (addr %#lx, gtable %#lx\n", > + adj_type, &entry->pdev->sbdf, addr, entry->gtable); > + > + return ADJACENT_DONT_HANDLE; > + } > + > + /* If PBA lives on the same page too, discard writes. */ > + if ( write && > + ((adj_type == ADJ_IDX_LAST && > + msix->table.last == msix->pba.first) || > + (adj_type == ADJ_IDX_FIRST && > + msix->table.first == msix->pba.last)) ) > + { > + gprintk(XENLOG_WARNING, > + "MSI-X table and PBA of %pp live on the same page, " > + "writing to other registers there is not implemented\n", > + &entry->pdev->sbdf); Here and above I think verbosity needs limiting to the first instance per device per domain. > + return ADJACENT_DISCARD_WRITE; > + } > + > + return msix->adj_access_idx[adj_type]; > +} > + > +static int adjacent_read( > + unsigned int fixmap_idx, > + paddr_t address, unsigned int len, uint64_t *pval) > +{ > + const void __iomem *hwaddr; > + > + *pval = ~0UL; > + > + ASSERT(fixmap_idx != ADJACENT_DISCARD_WRITE); > + > + hwaddr = fix_to_virt(fixmap_idx) + PAGE_OFFSET(address); > + > + switch ( len ) > + { > + case 1: > + *pval = readb(hwaddr); > + break; > + > + case 2: > + *pval = readw(hwaddr); > + break; > + > + case 4: > + *pval = readl(hwaddr); > + break; > + > + case 8: > + *pval = readq(hwaddr); > + break; > + > + default: > + ASSERT_UNREACHABLE(); > + } > + return X86EMUL_OKAY; > +} > + > +static int adjacent_write( > + unsigned int fixmap_idx, > + uint64_t address, uint32_t len, uint64_t val) This uses indentation different from the two cases further up. Types used also don't match adjacent_read()'s. > @@ -220,16 +349,31 @@ static int cf_check msixtbl_read( > unsigned long offset; > struct msixtbl_entry *entry; > unsigned int nr_entry, index; > + unsigned int adjacent_fixmap; > int r = X86EMUL_UNHANDLEABLE; > > - if ( (len != 4 && len != 8) || (address & (len - 1)) ) > + if ( !IS_ALIGNED(address, len) ) > return r; > > rcu_read_lock(&msixtbl_rcu_lock); > - > entry = msixtbl_find_entry(current, address); > if ( !entry ) > goto out; > + > + adjacent_fixmap = adjacent_handle(entry, address, false); > + if ( adjacent_fixmap != ADJACENT_DONT_HANDLE ) > + { > + r = adjacent_read(adjacent_fixmap, address, len, pval); > + goto out; > + } > + > + if ( address < entry->gtable || > + address >= entry->gtable + entry->table_len ) > + goto out; > + > + if ( len != 4 && len != 8 ) > + goto out; > + > offset = address & (PCI_MSIX_ENTRY_SIZE - 1); > > if ( offset != PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET ) > @@ -282,6 +426,7 @@ static int msixtbl_write(struct vcpu *v, unsigned long > address, > int r = X86EMUL_UNHANDLEABLE; > unsigned long flags; > struct irq_desc *desc; > + unsigned int adjacent_fixmap; > > if ( !IS_ALIGNED(address, len) ) > return X86EMUL_OKAY; > @@ -291,6 +436,19 @@ static int msixtbl_write(struct vcpu *v, unsigned long > address, > entry = msixtbl_find_entry(v, address); > if ( !entry ) > goto out; > + > + adjacent_fixmap = adjacent_handle(entry, address, true); > + if ( adjacent_fixmap != ADJACENT_DONT_HANDLE ) > + { > + r = adjacent_write(adjacent_fixmap, address, len, val); > + goto out; > + } > + if ( address < entry->gtable || > + address >= entry->gtable + entry->table_len ) > + goto out; > + if ( len != 4 && len != 8 ) > + goto out; > + Can this please follow the read side as far as use of blank lines goes? > @@ -622,12 +788,15 @@ void msix_write_completion(struct vcpu *v) > v->arch.hvm.hvm_io.msix_snoop_gpa ) > { > unsigned int token = hvmemul_cache_disable(v); > - const struct msi_desc *desc; > + const struct msi_desc *desc = NULL; > + const struct msixtbl_entry *entry; > uint32_t data; > > rcu_read_lock(&msixtbl_rcu_lock); > - desc = msixtbl_addr_to_desc(msixtbl_find_entry(v, snoop_addr), > - snoop_addr); > + entry = msixtbl_find_entry(v, snoop_addr); > + if ( entry && snoop_addr >= entry->gtable && > + snoop_addr < entry->gtable + entry->table_len ) Nit: Too deep indentation. > --- a/xen/arch/x86/include/asm/msi.h > +++ b/xen/arch/x86/include/asm/msi.h > @@ -207,6 +207,10 @@ struct msg_address { > PCI_MSIX_ENTRY_SIZE + \ > (~PCI_MSIX_BIRMASK & (PAGE_SIZE - 1))) > > +/* indexes in adj_access_idx[] below */ Nit: Comment style again. > @@ -1078,6 +1108,16 @@ static void _pci_cleanup_msix(struct arch_msix *msix) > WARN(); > msix->table.first = 0; > msix->table.last = 0; > + if ( msix->adj_access_idx[ADJ_IDX_FIRST] ) > + { > + msix_put_fixmap(msix, msix->adj_access_idx[ADJ_IDX_FIRST]); > + msix->adj_access_idx[ADJ_IDX_FIRST] = 0; > + } > + if ( msix->adj_access_idx[ADJ_IDX_LAST] ) > + { > + msix_put_fixmap(msix, msix->adj_access_idx[ADJ_IDX_LAST]); > + msix->adj_access_idx[ADJ_IDX_LAST] = 0; > + } > > if ( rangeset_remove_range(mmio_ro_ranges, msix->pba.first, > msix->pba.last) ) This could probably do with another blank line at the head of the addition. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |