
Re: [Xen-users] Enabling AMD-Vi IOMMU panics Xen

I've finally gotten some time to look at this problem in detail again. (It's been two months!)

Firstly, thanks for trying to help me solve this issue!
I'm very much a beginner firmware dev right now, so forgive me if I ask any stupid questions :)

On 15 April 2016 at 19:27, Kyösti Mälkki <kyosti.malkki@xxxxxxxxx> wrote:

On Thu, Apr 14, 2016 at 9:29 PM, 小太 <nospam@xxxxxxxx> wrote:

On Sat, Apr 9, 2016 at 10:19 PM, kmalkki <kyosti.malkki@xxxxxxxxx> wrote:
+++ b/src/cpu/amd/pi/heapmanager.c
@@ -36,7 +36,7 @@ void EmptyHeap(void)
     memset(BiosManagerPtr, 0, BIOS_HEAP_SIZE);
-#if IS_ENABLED(CONFIG_NORTHBRIDGE_AMD_PI_00630F01) && !defined(__PRE_RAM__)
+#if IS_ENABLED(CONFIG_NORTHBRIDGE_AMD_PI_00730F01) && !defined(__PRE_RAM__)

What does this change do? It doesn't look like it's used by the IOMMU.

See Family 16h Model 30h-3Fh registers D0F0x98_x26 and D0F0x98_x27. AGESA makes a request to reserve 128 bytes of physical memory that is safe for DMA-like operation.

Is that documented anywhere public? I couldn't find anything about it: the BKDG only describes what the registers are, not how much memory needs to be allocated or when it gets allocated.
Or is it just something you've learnt from past experience developing for AMD chips and/or their IOMMU?
[048h 0072   1]                   Entry Type : 03
[049h 0073   2]                    Device ID : 0008
[04Bh 0075   1]                 Data Setting : 00

[04Ch 0076   1]                   Entry Type : 04
[04Dh 0077   2]                    Device ID : FFFE
[04Fh 0079   1]                 Data Setting : 00

This range is 0:1.0 to ff:1f.6. The very last function is not in the range?

Would this really be an issue though? Since according to lspci, the devices on this board only go up to 03:00.0.
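For reference, the range quoted above can be checked by hand. This is a minimal sketch, assuming the standard 16-bit PCI device ID layout (8 bits of bus, 5 bits of device, 3 bits of function); the helper names decode_bdf and encode_bdf are made up for illustration and are not from coreboot or Xen.

```python
def decode_bdf(device_id: int) -> str:
    """Split a 16-bit device ID into bus:device.function (bus and device in hex)."""
    if not 0 <= device_id <= 0xFFFF:
        raise ValueError("device ID must fit in 16 bits")
    bus = device_id >> 8               # upper 8 bits
    device = (device_id >> 3) & 0x1F   # next 5 bits
    function = device_id & 0x7         # lowest 3 bits
    return f"{bus:02x}:{device:02x}.{function}"


def encode_bdf(bus: int, device: int, function: int) -> int:
    """Pack bus/device/function back into a 16-bit device ID."""
    return (bus << 8) | (device << 3) | function


# Device IDs taken from the type 03 / type 04 entries in the dump above.
range_start, range_end = 0x0008, 0xFFFE
print(decode_bdf(range_start), "to", decode_bdf(range_end))

# 03:00.0 is the highest address lspci reports on this board.
highest_seen = encode_bdf(0x03, 0x00, 0)
print("03:00.0 inside range:", range_start <= highest_seen <= range_end)
```

Under that assumption the excluded address is ff:1f.7 only, which is far above anything lspci reports on this board.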
[050h 0080   1]                   Entry Type : 43
[051h 0081   2]                    Device ID : FF00
[053h 0083   1]                 Data Setting : 00
[054h 0084   1]                     Reserved : 00
[055h 0085   2]        Source Used Device ID : 00A4
[057h 0087   1]                     Reserved : 00

[058h 0088   1]                   Entry Type : 04
[059h 0089   2]                    Device ID : FFFF
[05Bh 0091   1]                 Data Setting : 00

This range is ff:0.0 to ff:1f.7. The source device id of 0:14.4 was previously a PCI bridge so this entry is bogus for this family of APU. Should we describe each PCIe root port / bridge in this table?

For the same reason as above, even though the entry is bogus, it doesn't feel like it would cause the panic.
Though I suppose a *missing* type 43 entry could cause the panic (see below).
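To make the type 43 entry above easier to read, here is a rough sketch that unpacks it. The 8 raw bytes are reconstructed from the field offsets and values in the dump (ACPI tables are little-endian), not read from real hardware, and parse_alias_entry is a hypothetical helper name.

```python
import struct


def decode_bdf(device_id: int) -> str:
    """Format a 16-bit device ID as bus:device.function."""
    return f"{device_id >> 8:02x}:{(device_id >> 3) & 0x1F:02x}.{device_id & 0x7}"


def parse_alias_entry(raw: bytes) -> dict:
    """Unpack one 8-byte entry laid out as in the dump above:
    type (1), device ID (2), data setting (1), reserved (1),
    source device ID (2), reserved (1)."""
    entry_type, dev_id, data_setting, _rsvd1, source_id, _rsvd2 = struct.unpack(
        "<BHBBHB", raw
    )
    return {
        "type": entry_type,
        "device": decode_bdf(dev_id),
        "data_setting": data_setting,
        "source": decode_bdf(source_id),
    }


# Bytes reconstructed from offsets 050h-057h of the dump (an assumption).
raw_entry = bytes([0x43, 0x00, 0xFF, 0x00, 0x00, 0xA4, 0x00, 0x00])
entry = parse_alias_entry(raw_entry)
print(f"type {entry['type']:02X}: start {entry['device']}, source {entry['source']}")
```

The decoded source device matches the 0:14.4 address mentioned in the quoted comment.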

On 4 May 2016 at 22:05, Kyösti Mälkki <kyosti.malkki@xxxxxxxxx> wrote:

Attached an additional patch to remove some invalid entry in IVRS. I was able to do PCI passthru for NIC and ssh out of the guest OS with this, but I did not really check the situation without this patch applied.

I applied all three of your patches against a clean version of the apu's coreboot source, and it still ended up panicking Xen (with the same message as before). Checking the IVRS table showed that indeed your override was applied.

One possibly notable difference between our apu boards is that mine is the 4 GB version while yours is the 2 GB one, though it doesn't feel like that would explain why yours works and mine doesn't.
I'm also running a newer version of Xen (4.6), and I haven't tried it with your version yet, so I don't know if it might be a regression in Xen or not.

That said, investigating a bit more, I noticed these lines in my serial output:
(XEN) PCI add device 0000:01:00.0
[    6.202363] pci 0000:00:02.2: PCI bridge to [bus 01]
(XEN) PCI add device 0000:02:00.0
[    6.204287] pci 0000:00:02.3: PCI bridge to [bus 02]
(XEN) PCI add device 0000:03:00.0
[    6.206140] pci 0000:00:02.4: PCI bridge to [bus 03]

as well as this from lspci:
00:02.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 16h Processor Functions 5:1
00:02.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 16h Processor Functions 5:1
00:02.4 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 16h Processor Functions 5:1

So maybe the panic is caused by the IVRS table missing type 43 entries for these bridges, which the NICs sit behind?
I'll try adding those entries to the table and see if I'm successful.
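
As a starting point, this small sketch pulls the PCI addresses out of the serial output quoted above and converts them to 16-bit device IDs. It only does the address arithmetic, assuming the usual bus/device/function packing; it says nothing about which IVRS entries are actually required. The names ADDR and device_ids are illustrative.

```python
import re

# Serial output lines copied from above.
LOG = """\
(XEN) PCI add device 0000:01:00.0
[    6.202363] pci 0000:00:02.2: PCI bridge to [bus 01]
(XEN) PCI add device 0000:02:00.0
[    6.204287] pci 0000:00:02.3: PCI bridge to [bus 02]
(XEN) PCI add device 0000:03:00.0
[    6.206140] pci 0000:00:02.4: PCI bridge to [bus 03]
"""

# Matches segment:bus:device.function, e.g. 0000:00:02.2
ADDR = re.compile(r"\b[0-9a-f]{4}:([0-9a-f]{2}):([0-9a-f]{2})\.([0-7])\b")


def device_ids(text: str) -> list:
    """Return the 16-bit device ID of every PCI address found in text, in order."""
    ids = []
    for bus, dev, fn in ADDR.findall(text):
        ids.append((int(bus, 16) << 8) | (int(dev, 16) << 3) | int(fn))
    return ids


for dev_id in device_ids(LOG):
    print(f"{dev_id:04X}")
```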