
Re: [Xen-devel] [PATCH][XSA-126] xen: limit guest control of PCI command register



>>> On 07.06.15 at 08:23, <mst@xxxxxxxxxx> wrote:
> On Mon, Apr 20, 2015 at 04:32:12PM +0200, Michael S. Tsirkin wrote:
>> On Mon, Apr 20, 2015 at 03:08:09PM +0100, Jan Beulich wrote:
>> > >>> On 20.04.15 at 15:43, <mst@xxxxxxxxxx> wrote:
>> > > On Mon, Apr 13, 2015 at 01:51:06PM +0100, Jan Beulich wrote:
>> > >> >>> On 13.04.15 at 14:47, <mst@xxxxxxxxxx> wrote:
>> > >> > Can you check device capabilities register, offset 0x4 within
>> > >> > pci express capability structure?
>> > >> > Bit 15 is Role-Based Error Reporting.
>> > >> > Is it set?
>> > >> > 
>> > >> > The spec says:
>> > >> > 
>> > >> >       On platforms where robust error handling and PC-compatible
>> > >> >       Configuration Space probing is required, it is suggested
>> > >> >       that software or firmware have the Unsupported Request
>> > >> >       Reporting Enable bit Set for Role-Based Error Reporting
>> > >> >       Functions, but clear for 1.0a Functions. Software or
>> > >> >       firmware can distinguish the two classes of Functions by
>> > >> >       examining the Role-Based Error Reporting bit in the Device
>> > >> >       Capabilities register.
>> > >> 
>> > >> Yes, that bit is set.
>> > > 
>> > > curiouser and curiouser.
>> > > 
>> > > So with functions that do support Role-Based Error Reporting, we have
>> > > this:
>> > > 
>> > > 
>> > >  With device Functions implementing Role-Based Error Reporting,
>> > >  setting the Unsupported Request Reporting Enable bit will not
>> > >  interfere with PC-compatible Configuration Space probing, assuming
>> > >  that the severity for UR is left at its default of non-fatal.
>> > >  However, setting the Unsupported Request Reporting Enable bit will
>> > >  enable the Function to report UR errors detected with posted
>> > >  Requests, helping avoid this case for potential silent data
>> > >  corruption.
>> > 
>> > I still don't see what the PC-compatible config space probing has to
>> > do with our issue.
>> 
>> I'm not sure but I think it's listed here because it causes a ton of URs
>> when device scan probes unimplemented functions.
>> 
>> > > did firmware reconfigure this device to report URs as fatal errors then?
>> > 
>> > No, the Unsupported Request Error Severity flag is zero.
>> 
>> OK, that's the correct configuration, so how come the box crashes when
>> there's a UR then?
> 
> Ping - any update on this?

Not really. All we concluded so far is that _maybe_ the bridge, upon
seeing the UR, generates a Master Abort, rendering the whole thing
fatal. OTOH the respective root port also has
- Received Master Abort set in its Secondary Status register (but
  that's also already the case in the log that we have before the UR
  occurs, i.e. that doesn't mean all that much),
- Received System Error set in its Secondary Status register (and
  after the UR the sibling endpoint [UR originating from 83:00.0,
  sibling being 83:00.1] also shows Signaled System Error set).
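
For reference, a minimal sketch of how the bits named in this thread could be decoded from raw register values. The bit positions follow the PCI and PCI Express specifications (Secondary Status bits 13 and 14, Device Capabilities bit 15); the function names and the sample values are illustrative assumptions and are not taken from the logs of the affected box.

```python
# Bit masks per the PCI / PCI Express specifications.
# Function names and sample values are hypothetical, for illustration only.
PCI_EXP_DEVCAP_RBER = 1 << 15  # Device Capabilities: Role-Based Error Reporting

# Secondary Status register (type 1 header) flags discussed in this thread.
PCI_SEC_STATUS_FLAGS = {
    1 << 13: "Received Master Abort",
    1 << 14: "Received System Error",
}


def decode_secondary_status(value):
    """Return the names of the Secondary Status flags above that are set."""
    return [name for mask, name in PCI_SEC_STATUS_FLAGS.items() if value & mask]


def is_role_based(devcap):
    """True if the Device Capabilities value has Role-Based Error Reporting set."""
    return bool(devcap & PCI_EXP_DEVCAP_RBER)


# Example with a made-up Secondary Status value that has bits 13 and 14 set.
print(decode_secondary_status(0x6000))
```

With a value of 0x6000 both flags are reported, which matches the state described for the root port after the UR.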

> So can we chalk this up to hardware bugs on a specific box?

I have to admit that I'm still very uncertain whether to consider all
this correct behavior, a firmware flaw, or a hardware bug.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

