[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] xen/arm: fix SBDF calculation for vPCI MMIO handlers
On 28.10.21 16:36, Roger Pau Monné wrote: > On Thu, Oct 28, 2021 at 12:09:23PM +0000, Oleksandr Andrushchenko wrote: >> Hi, Julien! >> >> On 27.10.21 20:35, Julien Grall wrote: >>> Hi Oleksandr, >>> >>> On 27/10/2021 09:25, Oleksandr Andrushchenko wrote: >>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx> >>>> >>>> While in vPCI MMIO trap handlers for the guest PCI host bridge it is not >>>> enough for SBDF translation to simply call VPCI_ECAM_BDF(info->gpa) as >>>> the base address may not be aligned in the way that the translation >>>> always work. If not adjusted with respect to the base address it may not be >>>> able to properly convert SBDF and crashes: >>>> >>>> (XEN) vpci_mmio_read 0000:65:1a.0 reg 8bc gpa e65d08bc >>> I can't find a printk() that may output this message. Where does this comes >>> from? >> That was a debug print. I shouldn't have used that in the patch description, >> but >> probably after "---" to better explain what's happening >>> Anyway, IIUC the guest physical address is 0xe65d08bc which, if I am not >>> mistaken, doesn't belong to the range advertised for GUEST_VPCI_ECAM. >> This is from dom0 I am working on now. >>> IMHO, the stack trace should come from usptream Xen or need some >>> information to explain how this was reproduced. >>> >>>> (XEN) Data Abort Trap. Syndrome=0x6 >>>> (XEN) Walking Hypervisor VA 0x467a28bc on CPU0 via TTBR 0x00000000481d5000 >>> I can understnad that if we don't substract GUEST_VPCI_ECAM, we would (in >>> theory) not get the correct BDF. But... I don't understand how this would >>> result to a data abort in the hypervisor. >>> >>> In fact, I think the vPCI code should be resilient enough to not crash if >>> we pass the wrong BDF. >> Well, there is no (?) easy way to validate SBDF. And this could be a problem >> if we have a misbehaving >> guest which may force Xen to access the memory beyond that of PCI host bridge > How could that be? The ECAM region exposed to the guest you should be > the same as the physical one for dom0? Ok, I have a Designware PCI hist which has 2 ECAM regions (I am starting to implement the driver for it, so I can be wrong here): - Root Complex ECAM area ("dbi"), it is something like 0x3000 bytes long - "Client" ECAM area ("config") So from Dom0 POV we have 2 ECAM regions and for the guest we always emulate a single big region: /* * 256 MB is reserved for VPCI configuration space based on calculation * 256 buses x 32 devices x 8 functions x 4 KB = 256 MB */ #define GUEST_VPCI_ECAM_BASE xen_mk_ullong(0x10000000) #define GUEST_VPCI_ECAM_SIZE xen_mk_ullong(0x10000000) So, we have the base address and size of the emulated ECAM space not connected to the real host bridge > > And for domUs you really need to fix vpci_{read,write} to not > passthrough accesses not explicitly handled. Do you mean that we need to validate SBDFs there? This can be tricky if we have a use-case when a PCI device being passed through if not put at 0000:00:0.0, but requested to be, for example, 0000:0d:0.0. So, we need to go over the list of virtual devices and see if SBDF the guest is trying to access is a valid SBDF. Is this what you mean? > >>> When there is a data abort in Xen, you should get a stack trace from where >>> this comes from. Can you paste it here? >> (XEN) Data Abort Trap. Syndrome=0x6 >> (XEN) Walking Hypervisor VA 0x467a28bc on CPU0 via TTBR 0x00000000481d5000 >> (XEN) 0TH[0x0] = 0x00000000481d4f7f >> (XEN) 1ST[0x1] = 0x00000000481d2f7f >> (XEN) 2ND[0x33] = 0x0000000000000000 >> (XEN) CPU0: Unexpected Trap: Data Abort >> (XEN) ----[ Xen-4.16-unstable arm64 debug=y Not tainted ]---- >> (XEN) CPU: 0 >> (XEN) PC: 000000000026d3d4 pci_generic_config_read+0x88/0x9c >> (XEN) LR: 000000000026d36c >> (XEN) SP: 000080007ff97c00 >> (XEN) CPSR: 0000000060400249 MODE:64-bit EL2h (Hypervisor, handler) >> (XEN) X0: 00000000467a28bc X1: 00000000065d08bc X2: 00000000000008bc >> (XEN) X3: 000000000000000c X4: 000080007ffc6fd0 X5: 0000000000000000 >> (XEN) X6: 0000000000000014 X7: ffff800011a58000 X8: ffff0000225a0380 >> (XEN) X9: 0000000000000000 X10: 0101010101010101 X11: 0000000000000028 >> (XEN) X12: 0101010101010101 X13: 0000000000000020 X14: ffffffffffffffff >> (XEN) X15: 0000000000000001 X16: ffff800010da6708 X17: 0000000000000020 >> (XEN) X18: 0000000000000002 X19: 0000000000000004 X20: 000080007ff97c5c >> (XEN) X21: 00000000000008bc X22: 00000000000008bc X23: 0000000000000004 >> (XEN) X24: 0000000000000000 X25: 00000000000008bc X26: 00000000000065d0 >> (XEN) X27: 000080007ffb9010 X28: 0000000000000000 FP: 000080007ff97c00 >> (XEN) >> (XEN) VTCR_EL2: 00000000800a3558 >> (XEN) VTTBR_EL2: 00010000bffba000 >> (XEN) >> (XEN) SCTLR_EL2: 0000000030cd183d >> (XEN) HCR_EL2: 00000000807c663f >> (XEN) TTBR0_EL2: 00000000481d5000 >> (XEN) >> (XEN) ESR_EL2: 0000000096000006 >> (XEN) HPFAR_EL2: 0000000000e65d00 >> (XEN) FAR_EL2: 00000000467a28bc >> (XEN) >> [snip] >> (XEN) Xen call trace: >> (XEN) [<000000000026d3d4>] pci_generic_config_read+0x88/0x9c (PC) >> (XEN) [<000000000026d36c>] pci_generic_config_read+0x20/0x9c (LR) >> (XEN) [<000000000026d2c8>] pci-access.c#pci_config_read+0x60/0x84 >> (XEN) [<000000000026d4a8>] pci_conf_read32+0x10/0x18 >> (XEN) [<000000000024dcf8>] vpci.c#vpci_read_hw+0x48/0xb8 >> (XEN) [<000000000024e3e4>] vpci_read+0xac/0x24c >> (XEN) [<000000000024e934>] vpci_ecam_read+0x78/0xa8 >> (XEN) [<000000000026e368>] vpci.c#vpci_mmio_read+0x44/0x7c >> (XEN) [<0000000000275054>] try_handle_mmio+0x1ec/0x264 >> (XEN) [<000000000027ea50>] traps.c#do_trap_stage2_abort_guest+0x18c/0x2d8 >> (XEN) [<000000000027f088>] do_trap_guest_sync+0xf0/0x618 >> (XEN) [<0000000000269c58>] entry.o#guest_sync_slowpath+0xa4/0xd4 >> (XEN) >> (XEN) >> (XEN) **************************************** >> (XEN) Panic on CPU 0: >> (XEN) CPU0: Unexpected Trap: Data Abort >> (XEN) **************************************** > Are you exposing an ECAM region to the guest bigger than the > underlying one, and that's why you get crashes? (because you get out of > the hardware range) Please see above > I would assume physical accesses to the ECAM area reported by the > hardware don't trigger traps? No > > Roger. Thank you, Oleksandr
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |