[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v5 3/7] x86/hvm: Allow access to registers on the same page as MSI-X table



On Thu, Apr 25, 2024 at 01:15:34PM +0200, Jan Beulich wrote:
> On 13.03.2024 16:16, Marek Marczykowski-Górecki wrote:
> > Some devices (notably Intel Wifi 6 AX210 card) keep auxiliary registers
> > on the same page as MSI-X table. Device model (especially one in
> > stubdomain) cannot really handle those, as direct writes to that page is
> > refused (page is on the mmio_ro_ranges list). Instead, extend
> > msixtbl_mmio_ops to handle such accesses too.
> > 
> > Doing this, requires correlating read/write location with guest
> > of MSI-X table address. Since QEMU doesn't map MSI-X table to the guest,
> > it requires msixtbl_entry->gtable, which is HVM-only. Similar feature
> > for PV would need to be done separately.
> > 
> > This will be also used to read Pending Bit Array, if it lives on the same
> > page, making QEMU not needing /dev/mem access at all (especially helpful
> > with lockdown enabled in dom0). If PBA lives on another page, QEMU will
> > map it to the guest directly.
> > If PBA lives on the same page, discard writes and log a message.
> > Technically, writes outside of PBA could be allowed, but at this moment
> > the precise location of PBA isn't saved, and also no known device abuses
> > the spec in this way (at least yet).
> > 
> > To access those registers, msixtbl_mmio_ops need the relevant page
> > mapped. MSI handling already has infrastructure for that, using fixmap,
> > so try to map first/last page of the MSI-X table (if necessary) and save
> > their fixmap indexes. Note that msix_get_fixmap() does reference
> > counting and reuses existing mapping, so just call it directly, even if
> > the page was mapped before. Also, it uses a specific range of fixmap
> > indexes which doesn't include 0, so use 0 as default ("not mapped")
> > value - which simplifies code a bit.
> > 
> > GCC 12.2.1 gets confused about 'desc' variable:
> > 
> >     arch/x86/hvm/vmsi.c: In function ‘msixtbl_range’:
> >     arch/x86/hvm/vmsi.c:553:8: error: ‘desc’ may be used uninitialized 
> > [-Werror=maybe-uninitialized]
> >       553 |     if ( desc )
> >           |        ^
> >     arch/x86/hvm/vmsi.c:537:28: note: ‘desc’ was declared here
> >       537 |     const struct msi_desc *desc;
> >           |                            ^~~~
> > 
> > It's conditional initialization is actually correct (in the case where
> > it isn't initialized, function returns early), but to avoid
> > build failure initialize it explicitly to NULL anyway.
> > 
> > Signed-off-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> 
> Sadly there are further more or less cosmetic issues. Plus, as indicated
> before, I'm not really happy for us to gain all of this extra code. In
> the end I may eventually give an R-b not including the usually implied
> A-b, to indicate the code (then) looks okay to me but I still want
> someone else to actually ack it to allow it going in.

I understand. Given similar code is committed for vPCI already, I hope
somebody will be comfortable with acking this one too (yes, I do realize
the vPCI one is much less exposed, but still).

> > +static int adjacent_read(
> > +    unsigned int fixmap_idx,
> > +    paddr_t address, unsigned int len, uint64_t *pval)
> > +{
> > +    const void __iomem *hwaddr;
> > +
> > +    *pval = ~0UL;
> > +
> > +    ASSERT(fixmap_idx != ADJACENT_DISCARD_WRITE);
> 
> Why only one of the special values? And before you add the other here:
> Why not simply ASSERT(fixmap_idx <= FIX_MSIX_IO_RESERV_END)? (Could of
> course bound at the other end, too, i.e. against FIX_MSIX_IO_RESERV_BASE.)

That's the most likely bug that could happen, but indeed broader assert
would be better.

> > +    hwaddr = fix_to_virt(fixmap_idx) + PAGE_OFFSET(address);
> > +
> > +    switch ( len )
> > +    {
> > +    case 1:
> > +        *pval = readb(hwaddr);
> > +        break;
> > +
> > +    case 2:
> > +        *pval = readw(hwaddr);
> > +        break;
> > +
> > +    case 4:
> > +        *pval = readl(hwaddr);
> > +        break;
> > +
> > +    case 8:
> > +        *pval = readq(hwaddr);
> > +        break;
> > +
> > +    default:
> > +        ASSERT_UNREACHABLE();
> 
> Misra demands "break;" to be here for release builds. In fact I wonder
> why "*pval = ~0UL;" isn't put here, too. Question of course is whether
> in such a case a true error indicator wouldn't be yet better.

I don't think it possible for the msixtbl_read() (that calls
adjacent_read()) to be called with other sizes. The default label is
here exactly to make it obvious for the reader.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

Attachment: signature.asc
Description: PGP signature


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.