
Re: Xen-Error: Disabling IOMMU on Stepping C2 5520 Host-Bridge // SLH3P marking on die



Hi all,

No, the correct behavior is to simply use the host bridge, since it is fine and works!
Isn't it just that the board's BIOS sets up the PCI config space wrongly?

To get to the truth,
I removed the cooler, cleaned off the "phase change" wax,
photographed the laser engraving on the flip-chip die and compared
the text with the errata sheet ("spec update") from Intel.

According to the laser marking and the errata sheet, the chip is a 5520 with C2
stepping, as it has an SLH3P marking on its die. I took a photo of it,
which is available on request.
The errata sheet assigns SLH3P to the C2 stepping and states that it supports Intel
Trusted Execution Technology (TXT). This is on page 11 (3rd line of the table) of said Intel errata sheet:
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/5520-and-5500-chipset-ioh-specification-update.pdf


So both chipset errata #47 and #53, which the code snippet cites as the reason
for disabling the VT-d feature, are not present in this hardware, and the host bridge
should be kosher.

For some weird reason the PCI rev is 13.
I guess that the ID is written by the BIOS, using
PCI config cycles at early boot, into registers of the host bridge, and is
then displayed by tools like lspci.
Page 11 of the errata sheet says:
"3. The Revision Number corresponds to bits 7:0 of the Revision ID Register located at offset 08h in the PCI
function 0 configuration space"
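
To make that footnote concrete, here is a minimal sketch in C of how the revision byte could be picked out of a raw dump of the host bridge's config space. The helper names pci_revision() and ioh_stepping() are made up for illustration, and the revision-to-stepping mapping (0x13 = B-3, 0x22 = C-2) is only what has been quoted in this thread, not something I have verified beyond that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_REVISION_ID 0x08  /* offset of the Revision ID register */

/*
 * Hypothetical helper: return bits 7:0 of the Revision ID register,
 * i.e. the single byte at offset 08h of function 0's config space.
 */
static uint8_t pci_revision(const uint8_t *cfg, size_t len)
{
    assert(cfg != NULL && len > PCI_REVISION_ID);
    return cfg[PCI_REVISION_ID];
}

/*
 * Hypothetical helper: map the revision byte to a stepping name.
 * Only the two values mentioned in this thread are covered.
 */
static const char *ioh_stepping(uint8_t rev)
{
    switch (rev) {
    case 0x13: return "B-3";
    case 0x22: return "C-2";
    default:   return "unknown";
    }
}
```

On Linux the input bytes could presumably be read from /sys/bus/pci/devices/0000:00:00.0/config; on my Z600 this would yield 0x13, which is exactly the value that contradicts the marking on the die.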


But in general:
This is not Windows, so I would expect a kernel boot option
that just says "I ignore your warning, and if a black hole forms in my mainboard
it is my fault"; force_5520_C2=1 or something like it would be appropriate.
A small readme could then advise people who are affected by a flaky implementation
of the C2 host bridge to give it a try! What is the worst that could happen?
Losing all your data on a freshly installed Qubes OS?
"Oh, I forgot my HDD password, and forgot to write it under the keyboard ;-) , so
I need to reinstall."
What is the difference? Computers should do what the user wants them to do,
and when they break, it is the fault of the user who ordered them to fail.

So please add a kernel boot option that overrides this if-statement,
so that only a warning is printed to the log but the IOMMU is not disabled:
    if ( rev == 0x13 && device == 0x342e8086 )
    {
        if ( force_5520_C2 == 1 )
        {
            printk(XENLOG_WARNING VTDPREFIX
                   "NOT Disabling IOMMU as you requested force_5520_C2=1 and ignoring Intel 5500/5520/X58 Chipset errata #47, #53\n");
        }
        else
        {
            printk(XENLOG_WARNING VTDPREFIX
                   "Disabling IOMMU due to Intel 5500/5520/X58 Chipset errata #47, #53\n");
            iommu_enable = 0;
            break;
        }
    }


Cheers,

luja


On Tuesday, July 27, 2021 14:21 CEST, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
 
On 25/07/2021 14:55, Marek Marczykowski-Górecki wrote:
> On Sun, Jul 25, 2021 at 02:31:17PM +0200, luja wrote:
>> This Z600 is equipped with 0B54h mainboard as can be seen with dmi-decode.
>>
>> The manual states that 0B54h mainboard has the "newer C2 stepping",
>> so it is *not* affected by Intel "spec update" (nota bene: Intel updates the
>> spec, others report erratas) bugs  
> The code above checks for rev 0x13, and the spec (page 11) clearly says that rev
> 0x13 is stepping B-3. Stepping C-2 is rev 0x22. So, if this check
> triggers for you, I'm afraid you have the affected chipset.

The ID in hardware is the authoritative information.  Sounds like the
Z600 manual is wrong.

>> So the way Xen detects the "bug" (pci rev 13) is not sufficient, as my Z600
>> shows pci rev13 with lspci but 0xB54h (board rev only on Z600) with dmidecode
>> I would suggest first to have an override xen kernel boot option to disable the disablement in this code section. Or just patch this part out of the Xen code and rebuild xen. If this stuff really crashes, one will see it.
> Patching it out is out of the question, this check is there for a
> reason.

Using interrupt remapping on these systems does cause it to cease
functioning.

>> So please build a new xen without this stupid disablement or please add an override boot command for it.
>>
>> Please see the attached upgrade manual of Z600 and the errata "spec update" by Intel.
>> You see that the C2 stepping is not affected by the bugs referred to in the xen code,
>> so removing that section or adding better detection of the mask revision (B3 vs. C2)  of 5520 host bridge would allow  many users to operate Qubes4.
> Maybe someone else has an alternative idea?

The logic in Xen is broken.  I've tried fixing it before for XenServer,
but was objected to, and the patch is still in the patchqueue.

The errata is with the Queued Invalidation, which (in Xen) is tied to
interrupt remapping.  The rest of the IOMMU works fine.

The current status quo is that if Xen boots with an Intel gen1 IOMMU, it
will be happy with DMA remapping but no IRQ remapping.  If Xen boots on
this specific buggy system, it will turn the entire IOMMU off in
protest, which leaves the system less secure than booting on the
previous generation of hardware.

The correct behaviour is to just disable interrupt remapping in this
case, which brings Xen's behaviour in line with adjacent generations of
hardware.

~Andrew
 



 

 

