[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.21] vpci/msix: improve handling of bogus MSI-X capabilities


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Tue, 7 Oct 2025 09:16:31 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=p/aOhYhsczh2SPcNmLu63CTZyXv4Gdv49lmN3F+z6WY=; b=jmn0p7M9SQm3Wy0yYsbGoxUdXmr/VA8QehVdab3l4uefWMA9xfQeA/pRPL6DhQYIroZ1hCEXa6k/ozbqLPGfXcL5qwu9PRP3yyE+u+f7oSgA4ioYPHNZabDMu/X5D5P+0caBTnCQ7v1B6UJmyAPj+tj0OVjrjfjUWPb3BFJFTl0e8aiJwb8Sp2uiU5sDWeWO9RwKAMToiojb12eLg16TN9x9J9c1UTpXkdXP4G7GfWrPrxxyFVUMcGhBda6yGUtruVVsq+DrxL9dBSdYn6NuwiAvlaZyFTviB8au/m5B/Uuvk36HNdNhGLwJzZ9+fzABFY4+uCGMAtxZSDqJtCVfsw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=H6syiERVYXl+MfdWBiKOCoOXrIvpUnAV5Mw2sSeDjz38/EdJKc6/E0EeDPh2rFNhflRcO/sB8W1Oyrr8LI1jvcYQ1zRfiGE06qFou9Izhxni2/aIoqR7URR8fyqLa+4bsoljCWwVcUucfotBBsMzy4fekNzTKtR7aTJ6Q3+AI5tn9WWBUr/R1wMKqauPrnLq5zNlFv6jZgUQEs1M0BVUlLwIzmcN7RyS+zwqrh/Tr6RAKVXnyl8sVQTnuOKVVfxCtkbiXKU/WtTXmqDEDj5BkyWph4rO0U/RUsWpuh3vbV1QpjvE/WHzcAJQWpjkGaFD/L5LCt4ilJ5HxiNNvtVvcg==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: Alejandro Vallejo <alejandro.garciavallejo@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Stewart Hildebrand <stewart.hildebrand@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>, Xen-devel <xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 07 Oct 2025 07:16:56 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Mon, Oct 06, 2025 at 03:29:11PM +0100, Andrew Cooper wrote:
> On 06/10/2025 2:55 pm, Alejandro Vallejo wrote:
> > On Tue Sep 30, 2025 at 2:57 PM CEST, Roger Pau Monné wrote:
> >> On Tue, Sep 30, 2025 at 11:15:01AM +0200, Alejandro Vallejo wrote:
> >>> On Mon Sep 29, 2025 at 10:41 AM CEST, Roger Pau Monne wrote:
> >>>> I've had the luck to come across a PCI card that exposes a MSI-X 
> >>>> capability
> >>>> where the BIR of the vector and PBA tables points at a BAR that has 0 
> >>>> size.
> >>>>
> >>>> This doesn't play nice with the code in vpci_make_msix_hole(), as it 
> >>>> would
> >>>> still use the address of such empty BAR (0) and attempt to crave a hole 
> >>>> in
> >>>> the p2m.  This leads to errors like the one below being reported by Xen:
> >>>>
> >>>> d0v0 0000:22:00.0: existing mapping (mfn: 181c4300 type: 0) at 0 
> >>>> clobbers MSIX MMIO area
> >>>>
> >>>> And the device left unable to enable memory decoding due to the failure
> >>>> reported by vpci_make_msix_hole().
> >>>>
> >>>> Introduce checking in init_msix() to ensure the BARs containing the MSI-X
> >>>> tables are usable.  This requires checking that the BIR points to a
> >>>> non-empty BAR, and the offset and size of the MSI-X tables can fit in the
> >>>> target BAR.
> >>>>
> >>>> This fixes booting PVH dom0 on Supermicro AS -2126HS-TN severs with AMD
> >>>> EPYC 9965 processors.  The broken device is:
> >>>>
> >>>> 22:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA 
> >>>> Controller [AHCI mode] (rev 93)
> >>>>
> >>>> There are multiple of those integrated controllers in the system, all
> >>>> broken in the same way.
> >>>>
> >>>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> >>>> ---
> >>>> Cc: Stewart Hildebrand <stewart.hildebrand@xxxxxxx>
> >>>> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> >>>> Cc: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
> >>>>
> >>>> While not strictly a bugfix, I consider this a worthy improvement so that
> >>>> PVH dom0 has a chance to boot on hardware that exposes such broken MSI-X
> >>>> capabilities.  Hence I think this change should be considered for 
> >>>> inclusion
> >>>> into 4.21.  There a risk of regressing on hardware that was already 
> >>>> working
> >>>> with PVH, but given enough testing that should be minimal.
> >>>> ---
> >>>>  xen/drivers/vpci/msix.c | 50 ++++++++++++++++++++++++++++++++++++-----
> >>>>  1 file changed, 45 insertions(+), 5 deletions(-)
> >>>>
> >>>> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> >>>> index 54a5070733aa..8458955d5bbb 100644
> >>>> --- a/xen/drivers/vpci/msix.c
> >>>> +++ b/xen/drivers/vpci/msix.c
> >>>> @@ -675,6 +675,51 @@ static int cf_check init_msix(struct pci_dev *pdev)
> >>>>      if ( !msix )
> >>>>          return -ENOMEM;
> >>>>  
> >>>> +    msix->tables[VPCI_MSIX_TABLE] =
> >>>> +        pci_conf_read32(pdev->sbdf, msix_table_offset_reg(msix_offset));
> >>>> +    msix->tables[VPCI_MSIX_PBA] =
> >>>> +        pci_conf_read32(pdev->sbdf, msix_pba_offset_reg(msix_offset));
> >>>> +
> >>>> +    /* Check that the provided BAR is valid. */
> >>>> +    for ( i = 0; i < ARRAY_SIZE(msix->tables); i++ )
> >>>> +    {
> >>>> +        const char *name = (i == VPCI_MSIX_TABLE) ? "vector" : "PBA";
> >>>> +        const struct vpci_bar *bars = pdev->vpci->header.bars;
> >>>> +        unsigned int bir = msix->tables[i] & PCI_MSIX_BIRMASK;
> >>>> +        unsigned int type;
> >>>> +        unsigned int offset = msix->tables[i] & ~PCI_MSIX_BIRMASK;
> >>>> +        unsigned int size =
> >>>> +            (i == VPCI_MSIX_TABLE) ? max_entries * PCI_MSIX_ENTRY_SIZE
> >>>> +                                   : ROUNDUP(DIV_ROUND_UP(max_entries, 
> >>>> 8), 8);
> >>>> +
> >>>> +        if ( bir >= ARRAY_SIZE(pdev->vpci->header.bars) )
> >>>> +        {
> >>>> +            printk(XENLOG_ERR "%pp: MSI-X %s table with out of range 
> >>>> BIR %u\n",
> >>>> +                   &pdev->sbdf, name, bir);
> >>> Would it be worth adding something here such that a device vendor testing 
> >>> their
> >>> hardware under Xen can trivially grep for device bugs?
> >>>
> >>> Something akin to "[Firmware bug]" on Linux, like "[Device bug]" or some 
> >>> such.
> >>>
> >>> It would also let anyone not very knowledgeable about PCI know that a 
> >>> device
> >>> they own is being unreasonable. Same below in the other XENLOG_ERR 
> >>> messages.
> >> We could add indeed.  I don't think we haven't done so in the past.
> >> If we go that route I would suggest that I add a:
> >>
> >> #define DEVICE_BUG_PREFIX "[Device bug] "
> >>
> >> in lib.h or similar, to make sure we use the same prefix uniformly.
> >> TBH
> 
> We have several FIRMWARE BUG's in Xen already, and several more that
> ought to move to this pattern.
> 
> Given that Linux has definitely been booted on this hardware, we should
> match whichever prefix they use for messages about this.

I don't think Linux prints any message about this, it simply ignores
the capability.

We have another instance of having to support buggy devices in vPCI:
when a device places registers in the same 4K page as the MSI-X vector
or PBA tables.  In that case the offending device was an Intel
Wireless card.

I'm happy to use "[Device Bug]", will adjust the patch this afternoon.

> What's unclear is whether AMD can even fix this with a firmware update. 
> I would have expected that the PCIe hardblock would have prevented
> making this mistake, but clearly not...

I didn't want to point fingers :), I have no idea if it can be fixed
in firmware.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.