Xen project Mailing List

Re: [PATCH for-4.21] vpci/msix: improve handling of bogus MSI-X capabilities

To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

From: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Tue, 7 Oct 2025 09:16:31 +0200

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=p/aOhYhsczh2SPcNmLu63CTZyXv4Gdv49lmN3F+z6WY=; b=jmn0p7M9SQm3Wy0yYsbGoxUdXmr/VA8QehVdab3l4uefWMA9xfQeA/pRPL6DhQYIroZ1hCEXa6k/ozbqLPGfXcL5qwu9PRP3yyE+u+f7oSgA4ioYPHNZabDMu/X5D5P+0caBTnCQ7v1B6UJmyAPj+tj0OVjrjfjUWPb3BFJFTl0e8aiJwb8Sp2uiU5sDWeWO9RwKAMToiojb12eLg16TN9x9J9c1UTpXkdXP4G7GfWrPrxxyFVUMcGhBda6yGUtruVVsq+DrxL9dBSdYn6NuwiAvlaZyFTviB8au/m5B/Uuvk36HNdNhGLwJzZ9+fzABFY4+uCGMAtxZSDqJtCVfsw==

Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=H6syiERVYXl+MfdWBiKOCoOXrIvpUnAV5Mw2sSeDjz38/EdJKc6/E0EeDPh2rFNhflRcO/sB8W1Oyrr8LI1jvcYQ1zRfiGE06qFou9Izhxni2/aIoqR7URR8fyqLa+4bsoljCWwVcUucfotBBsMzy4fekNzTKtR7aTJ6Q3+AI5tn9WWBUr/R1wMKqauPrnLq5zNlFv6jZgUQEs1M0BVUlLwIzmcN7RyS+zwqrh/Tr6RAKVXnyl8sVQTnuOKVVfxCtkbiXKU/WtTXmqDEDj5BkyWph4rO0U/RUsWpuh3vbV1QpjvE/WHzcAJQWpjkGaFD/L5LCt4ilJ5HxiNNvtVvcg==

Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;

Cc: Alejandro Vallejo <alejandro.garciavallejo@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Stewart Hildebrand <stewart.hildebrand@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>, Xen-devel <xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Tue, 07 Oct 2025 07:16:56 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Mon, Oct 06, 2025 at 03:29:11PM +0100, Andrew Cooper wrote: > On 06/10/2025 2:55 pm, Alejandro Vallejo wrote: > > On Tue Sep 30, 2025 at 2:57 PM CEST, Roger Pau Monné wrote: > >> On Tue, Sep 30, 2025 at 11:15:01AM +0200, Alejandro Vallejo wrote: > >>> On Mon Sep 29, 2025 at 10:41 AM CEST, Roger Pau Monne wrote: > >>>> I've had the luck to come across a PCI card that exposes a MSI-X > >>>> capability > >>>> where the BIR of the vector and PBA tables points at a BAR that has 0 > >>>> size. > >>>> > >>>> This doesn't play nice with the code in vpci_make_msix_hole(), as it > >>>> would > >>>> still use the address of such empty BAR (0) and attempt to crave a hole > >>>> in > >>>> the p2m. This leads to errors like the one below being reported by Xen: > >>>> > >>>> d0v0 0000:22:00.0: existing mapping (mfn: 181c4300 type: 0) at 0 > >>>> clobbers MSIX MMIO area > >>>> > >>>> And the device left unable to enable memory decoding due to the failure > >>>> reported by vpci_make_msix_hole(). > >>>> > >>>> Introduce checking in init_msix() to ensure the BARs containing the MSI-X > >>>> tables are usable. This requires checking that the BIR points to a > >>>> non-empty BAR, and the offset and size of the MSI-X tables can fit in the > >>>> target BAR. > >>>> > >>>> This fixes booting PVH dom0 on Supermicro AS -2126HS-TN severs with AMD > >>>> EPYC 9965 processors. The broken device is: > >>>> > >>>> 22:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA > >>>> Controller [AHCI mode] (rev 93) > >>>> > >>>> There are multiple of those integrated controllers in the system, all > >>>> broken in the same way. > >>>> > >>>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> > >>>> --- > >>>> Cc: Stewart Hildebrand <stewart.hildebrand@xxxxxxx> > >>>> Cc: Jan Beulich <jbeulich@xxxxxxxx> > >>>> Cc: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx> > >>>> > >>>> While not strictly a bugfix, I consider this a worthy improvement so that > >>>> PVH dom0 has a chance to boot on hardware that exposes such broken MSI-X > >>>> capabilities. Hence I think this change should be considered for > >>>> inclusion > >>>> into 4.21. There a risk of regressing on hardware that was already > >>>> working > >>>> with PVH, but given enough testing that should be minimal. > >>>> --- > >>>> xen/drivers/vpci/msix.c | 50 ++++++++++++++++++++++++++++++++++++----- > >>>> 1 file changed, 45 insertions(+), 5 deletions(-) > >>>> > >>>> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c > >>>> index 54a5070733aa..8458955d5bbb 100644 > >>>> --- a/xen/drivers/vpci/msix.c > >>>> +++ b/xen/drivers/vpci/msix.c > >>>> @@ -675,6 +675,51 @@ static int cf_check init_msix(struct pci_dev *pdev) > >>>> if ( !msix ) > >>>> return -ENOMEM; > >>>> > >>>> + msix->tables[VPCI_MSIX_TABLE] = > >>>> + pci_conf_read32(pdev->sbdf, msix_table_offset_reg(msix_offset)); > >>>> + msix->tables[VPCI_MSIX_PBA] = > >>>> + pci_conf_read32(pdev->sbdf, msix_pba_offset_reg(msix_offset)); > >>>> + > >>>> + /* Check that the provided BAR is valid. */ > >>>> + for ( i = 0; i < ARRAY_SIZE(msix->tables); i++ ) > >>>> + { > >>>> + const char *name = (i == VPCI_MSIX_TABLE) ? "vector" : "PBA"; > >>>> + const struct vpci_bar *bars = pdev->vpci->header.bars; > >>>> + unsigned int bir = msix->tables[i] & PCI_MSIX_BIRMASK; > >>>> + unsigned int type; > >>>> + unsigned int offset = msix->tables[i] & ~PCI_MSIX_BIRMASK; > >>>> + unsigned int size = > >>>> + (i == VPCI_MSIX_TABLE) ? max_entries * PCI_MSIX_ENTRY_SIZE > >>>> + : ROUNDUP(DIV_ROUND_UP(max_entries, > >>>> 8), 8); > >>>> + > >>>> + if ( bir >= ARRAY_SIZE(pdev->vpci->header.bars) ) > >>>> + { > >>>> + printk(XENLOG_ERR "%pp: MSI-X %s table with out of range > >>>> BIR %u\n", > >>>> + &pdev->sbdf, name, bir); > >>> Would it be worth adding something here such that a device vendor testing > >>> their > >>> hardware under Xen can trivially grep for device bugs? > >>> > >>> Something akin to "[Firmware bug]" on Linux, like "[Device bug]" or some > >>> such. > >>> > >>> It would also let anyone not very knowledgeable about PCI know that a > >>> device > >>> they own is being unreasonable. Same below in the other XENLOG_ERR > >>> messages. > >> We could add indeed. I don't think we haven't done so in the past. > >> If we go that route I would suggest that I add a: > >> > >> #define DEVICE_BUG_PREFIX "[Device bug] " > >> > >> in lib.h or similar, to make sure we use the same prefix uniformly. > >> TBH > > We have several FIRMWARE BUG's in Xen already, and several more that > ought to move to this pattern. > > Given that Linux has definitely been booted on this hardware, we should > match whichever prefix they use for messages about this. I don't think Linux prints any message about this, it simply ignores the capability. We have another instance of having to support buggy devices in vPCI: when a device places registers in the same 4K page as the MSI-X vector or PBA tables. In that case the offending device was an Intel Wireless card. I'm happy to use "[Device Bug]", will adjust the patch this afternoon. > What's unclear is whether AMD can even fix this with a firmware update. > I would have expected that the PCIe hardblock would have prevented > making this mistake, but clearly not... I didn't want to point fingers :), I have no idea if it can be fixed in firmware. Thanks, Roger.

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.