
Re: [Xen-devel] XSA-59


Q3:  Below is code (hopefully complete) that will check a Romley system and 
verify that the VSHDR register is correct:

    case 0x3c00 ... 0x3c0b:
        pos = pci_find_ext_capability(seg, bus, pdev->devfn,
        if ( !pos )
            if ( 0 == bus && 0 == pdev->devfn )
                dmi_aer_cap_id = pci_conf_read16(seg, 0, 0, 0, 0x148); // DMI Specific AER Capability ID
                if ( 0x0004 != dmi_aer_cap_id )
                    printk(XENLOG_WARNING "%04x:%02x:%02x.%u without AER capability?\n", seg, bus, dev, func);
                printk(XENLOG_WARNING "%04x:%02x:%02x.%u without AER capability?\n", seg, bus, dev, func);

I'm not sure I understand your concern about versions; the code checks for BDF 
0:0.0, which should be sufficient.

Q4: I don't quite understand your comment about never being able to put this to 
rest.  We've identified the chipsets that have the problem and created a 
quirk for them.  We've fixed the problem in follow-on chipsets, and our goal is 
to keep the issue fixed in all future chipsets.

Don Dugger
"Censeo Toto nos in Kansa esse decisse." - D. Gale
Ph: 303/443-3786

-----Original Message-----
From: Jan Beulich [mailto:JBeulich@xxxxxxxx] 
Sent: Thursday, February 27, 2014 4:42 AM
To: Liu, Jinsong
Cc: Mallick, Asit K; Dugger, Donald D; Li, Susie; Auld, Will; Wang, Yong Y; 
Bulygin, Yuriy; xen-devel@xxxxxxxxxxxxx
Subject: Re: XSA-59

>>> On 26.02.14 at 05:16, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
> Q1: is the PCI IDs list (0x3400 ...) of root ports a complete list? Jan 
> got it from a disclosure that Intel made to him well over two years 
> ago --> Any update about the list?
> [Asit]: There is no update to this list. This was provided in 2011 
> and included the IDs prior to being fixed.

Very interesting. While hunting down the data sheets for these IDs, I found the 
Xeon C5500/C3500 and Xeon E5 v2 ones, which add new sets (370x/3720 and 0e0x).

> Q3: the "...  without AER capability?" warning triggers on Jan's 
> systems --> is it an issue? or, how to handle it properly?
> [Asit] BIOS can have an option to not expose the AER capability. It 
> would be good to check the BIOS setup options. The error reporting 
> should be masked, so no action is needed.
> [Yuriy] I expanded the answer to Q3 vs. what's in the attached email 
> after we found out that when root port is operating in DMI mode, AER ext.
> capability is not in the chain of ext. capability headers. Please use 
> this one instead.
> Answer to Q3:
> On Romley system (DID 0x3c00 ... 0x3c0b), for Host bridge BDF=00:00.0, 
> when the root port is operating as DMI, AER extended capability is 
> defined in VSHDR (Vendor Specific Header) configuration register 
> (offset 0x148). It should have value 0x0004.
> After pci_find_ext_capability, if it didn't find AER capability, for
> BDF=00:00.0 the patch would need to check if VSHDR register has value 
> 0x0004 in bits [15:0].

And it also needs to check for version 1 afaict - on that same Romley system, 
00:00.0 has another one with version 2 at 0x280 (and
80:00.0 also has such, but is running in PCIe mode).

The E5 v2 seem to be using ID 5 version 3 instead - sort of confusing.
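
To make the ID/version point concrete, here is a minimal sketch of decoding the header read from offset 0x148. It assumes the usual PCIe field layout (ID in bits [15:0], version/revision in bits [19:16]), which extended capability headers and vendor-specific headers share. The helper names are made up for illustration and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: ID in bits [15:0], version/revision in bits [19:16]. */
static inline uint16_t hdr_id(uint32_t hdr)
{
    return hdr & 0xffff;
}

static inline uint8_t hdr_version(uint32_t hdr)
{
    return (hdr >> 16) & 0xf;
}

/*
 * Hypothetical helper: true if the 32-bit value read from offset 0x148
 * of 00:00.0 carries ID 0x0004 with version 1, i.e. the combination
 * discussed for Romley in DMI mode.
 */
static bool dmi_aer_hdr_matches(uint32_t vshdr)
{
    return hdr_id(vshdr) == 0x0004 && hdr_version(vshdr) == 1;
}
```

With this, a header carrying ID 4 version 2 (as seen at 0x280) or ID 5 version 3 (as E5 v2 appears to use) would not match and would need its own case.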

> Q4: the patches have no way of handling future chipsets (yet we also 
> have no indication that future chipsets would not exhibit the same bad 
> behavior) --> thoughts?
> [Jinsong] IMHO handle future chipset case by case.

I.e. we would never be able to fully put to rest this XSA? Rather undesirable I 
would think.

> BTW, some other information from Yuriy:
> VT-d-mask-UR-host-bridge.patch:
> 1. The workaround is only applicable to the host bridge device 00:00.0 
> (DMIBAR does not exist for other devices). The patch is written 
> generically for any PCIe device/bridge.

Rather than hard coding 00:00.0, should this then - if indeed sufficient - 
perhaps simply be checking for whether the bridge is a host one? Would that 
perhaps even make sense generalizing (i.e. not looking for particular device 
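
As a rough illustration of what such a check could look like, the sketch below tests the PCI class code instead of the BDF. It assumes the caller has already read the 16-bit base-class/sub-class value from config space offset 0x0a; the macro and function names are illustrative only, and whether this test is sufficient for the workaround is exactly the open question above.

```c
#include <stdbool.h>
#include <stdint.h>

/* PCI class code for a host bridge: base class 0x06, sub-class 0x00. */
#define CLASS_BRIDGE_HOST 0x0600

/*
 * class_device: 16-bit value from config space offset 0x0a
 * (sub-class in the low byte, base class in the high byte).
 */
static bool is_host_bridge(uint16_t class_device)
{
    return class_device == CLASS_BRIDGE_HOST;
}
```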

And of course an even better approach would likely be to do all of this only 
for the root ports handling devices actually getting passed to guests. That 
would also address the issue of the current patch not generally dealing with 
PCI segments other than 0 (due to
pci_vtd_quirk() being called at boot time only). But that would make the 
workaround more fragile due to relying on more topology information.

While going through the data sheets again, I also began to wonder whether 
LER_{XP,}UNCERRMSK might not also need the respective bits getting set. Sadly 
the data sheets don't have any detail on what signaling LER events involves 
(i.e. namely whether these can be fatal in any way to the host).

Anyway, attached the updated patch with the uncontroversial feedback and the 
information from the other data sheets I found integrated.

