Re: Problems in PV dom0 on recent x86 hardware
On 12/07/2024 2:46 pm, Juergen Gross wrote:
> On 12.07.24 12:35, Jürgen Groß wrote:
>> On 09.07.24 15:08, Jason Andryuk wrote:
>>> On 2024-07-09 06:56, Jürgen Groß wrote:
>>>> On 09.07.24 09:01, Jan Beulich wrote:
>>>>> On 09.07.2024 08:36, Jürgen Groß wrote:
>>>>>> On 09.07.24 08:24, Jan Beulich wrote:
>>>>>>> On 08.07.2024 23:30, Jason Andryuk wrote:
>>>>>>>> From the backtrace, it looks like the immediate case is just
>>>>>>>> trying to read a 4-byte version:
>>>>>>>>
>>>>>>>> >>>> [ 44.575541] ucsi_acpi_dsm+0x53/0x80
>>>>>>>> >>>> [ 44.575546] ucsi_acpi_read+0x2e/0x60
>>>>>>>> >>>> [ 44.575550] ucsi_register+0x24/0xa0
>>>>>>>> >>>> [ 44.575555] ucsi_acpi_probe+0x162/0x1e3
>>>>>>>>
>>>>>>>> int ucsi_register(struct ucsi *ucsi)
>>>>>>>> {
>>>>>>>>         int ret;
>>>>>>>>
>>>>>>>>         ret = ucsi->ops->read(ucsi, UCSI_VERSION,
>>>>>>>>                               &ucsi->version,
>>>>>>>>                               sizeof(ucsi->version));
>>>>>>>>
>>>>>>>> ->read being ucsi_acpi_read()
>>>>>>>>
>>>>>>>> However, the driver also appears write to adjacent addresses.
>>>>>>>
>>>>>>> There are also corresponding write functions in the driver, yes,
>>>>>>> but ucsi_acpi_async_write() (used directly or indirectly) similarly
>>>>>>> calls ucsi_acpi_dsm(), which wires through to acpi_evaluate_dsm().
>>>>>>> That's ACPI object evaluation, which isn't obvious without seeing
>>>>>>> the involved AML whether it might write said memory region.
>>>>>>
>>>>>> I guess an ACPI dump would help here?
>>>>>
>>>>> Perhaps, yes.
>>>>
>>>> It is available in the bug report:
>>>>
>>>> https://bugzilla.opensuse.org/show_bug.cgi?id=1227301
>>>
>>> After acpixtract & iasl:
>>>
>>> $ grep -ir FEEC *
>>> dsdt.dsl: OperationRegion (ECMM, SystemMemory, 0xFEEC2000, 0x0100)
>>> ssdt16.dsl: OperationRegion (SUSC, SystemMemory, 0xFEEC2100, 0x30)
>>>
>>>
>>> from the DSDT:
>>> Scope (\_SB.PCI0.LPC0.EC0)
>>> {
>>>     OperationRegion (ECMM, SystemMemory, 0xFEEC2000, 0x0100)
>>>     Field (ECMM, AnyAcc, Lock, Preserve)
>>>     {
>>>         TWBT, 2048
>>>     }
>>>
>>>     Name (BTBF, Buffer (0x0100)
>>>     {
>>>         0x00 // .
>>>     })
>>>     Method (BTIF, 0, NotSerialized)
>>>     {
>>>         BTBF = TWBT /* \_SB_.PCI0.LPC0.EC0_.TWBT */
>>>         Return (BTBF) /* \_SB_.PCI0.LPC0.EC0_.BTBF */
>>>     }
>>> }
>>>
>>> From SSDT16:
>>> DefinitionBlock ("", "SSDT", 2, "LENOVO", "UsbCTabl", 0x00000001)
>>> {
>>>     External (_SB_.PCI0.LPC0.EC0_, DeviceObj)
>>>
>>>     Scope (\_SB)
>>>     {
>>>         OperationRegion (SUSC, SystemMemory, 0xFEEC2100, 0x30)
>>>         Field (SUSC, ByteAcc, Lock, Preserve)
>>>         {
>>>
>>>
>>> This embedded controller (?) seems to live at 0xfeec2xxx.
>>
>> What is the takeaway from that?
>>
>> Is this a firmware bug (if yes, pointers to a specification saying that
>> this is an illegal configuration would be nice), or do we need a way to
>> map this page from dom0?
>
> I've found the following in the AMD IOMMU spec [1]:
>
> Received DMA requests without PASID in the 0xFEEx_xxxx address range are
> treated as MSI interrupts and are processed using interrupt remapping
> rather than address translation.
>
> To me this sounds as if there wouldn't be a major risk letting dom0 map
> physical addresses in this area, as long as "normal" I/Os to this area
> would result in DMA requests with a PASID. OTOH I'm not familiar with
> Xen IOMMU handling, so I might be completely wrong.
>
> Another question would be whether a device having resources in this
> area can even work through an IOMMU.

Address spaces are not fully uniform.
What 0xFEEx_xxxx means/does really does differ depending on your point
of view. The CPU accessing 0xFEEx_xxxx literally does different things
than a PCI device accessing the same range. That's why nothing outside
of the CPU can get at the LAPIC MMIO registers. No amount of remapping
trickery in the IOMMU pagetables is going to change this fact.

Now, the problem here is that 0xFEEx_xxxx is (for legacy reasons)
"known" to be the LAPIC MMIO, which has a 4k window at the bottom and
everything else in the 2M is reserved. And it appears that AMD have
started putting other things into that reserved space, which are only
described by AML, and not known to Xen.

Xen, generally, is very wary of mappings in and around here, because it
does need to prevent even dom0 having access to the interrupt controller
MMIO windows (I'm including IO-APICs too). So I expect Xen is saying
"that's an interrupt MMIO window, no" without knowing that there's
actually something else in there. (But I am just guessing.)

It comes full circle back to all the problems of Xen not being OSPM, for
which we don't have a good solution.

One thing that Tim Deegan suggested ages ago was to have an ACPI OSPM
stubdom, and provide pv-AML allowing dom0 to do things. Importantly, it
would let us do things like evaluate all methods on all processor
objects, knowing that e.g. vCPUs weren't relevant.

The more I think about it the more I like it, and it would allow us to
start taking some of the more invasive hacks out of Linux, but at the
same time it's also a giant quantity of work.

~Andrew
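
The address arithmetic behind the point above can be sketched in C. This is a minimal illustration only, not Xen code: the names `classify_region`, `page_of` and the `FEE_*`/`LAPIC_*` constants are hypothetical, the LAPIC window is assumed to sit at its architectural default base of 0xFEE00000 (it can be relocated), and "0xFEEx_xxxx" is read literally as the 1 MiB range starting there. The two regions tested are the ECMM and SUSC OperationRegions from the quoted DSDT and SSDT16.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants (assumptions, not taken from Xen's source). */
#define FEE_RANGE_BASE  0xFEE00000u /* start of 0xFEEx_xxxx */
#define FEE_RANGE_SIZE  0x00100000u /* 1 MiB spanned by 0xFEEx_xxxx */
#define LAPIC_MMIO_BASE 0xFEE00000u /* default LAPIC MMIO base */
#define LAPIC_MMIO_SIZE 0x00001000u /* 4 KiB register window */
#define PAGE_4K         0x1000u

enum fee_class {
    FEE_OUTSIDE, /* does not touch 0xFEEx_xxxx at all */
    FEE_LAPIC,   /* overlaps the 4 KiB LAPIC window */
    FEE_OTHER    /* inside 0xFEEx_xxxx but clear of the LAPIC window */
};

/* True if [a, a+alen) and [b, b+blen) share at least one byte. */
static bool ranges_overlap(uint64_t a, uint64_t alen,
                           uint64_t b, uint64_t blen)
{
    return a < b + blen && b < a + alen;
}

/* Classify a physical region relative to the 0xFEEx_xxxx range. */
enum fee_class classify_region(uint64_t start, uint64_t len)
{
    if (len == 0 ||
        !ranges_overlap(start, len, FEE_RANGE_BASE, FEE_RANGE_SIZE))
        return FEE_OUTSIDE;
    if (ranges_overlap(start, len, LAPIC_MMIO_BASE, LAPIC_MMIO_SIZE))
        return FEE_LAPIC;
    return FEE_OTHER;
}

/* Base address of the 4 KiB page containing addr. */
uint64_t page_of(uint64_t addr)
{
    return addr & ~(uint64_t)(PAGE_4K - 1);
}
```

Under these assumptions, `classify_region(0xFEEC2000, 0x0100)` and `classify_region(0xFEEC2100, 0x30)` both come out as `FEE_OTHER`, and `page_of()` places both regions in the single page at 0xFEEC2000. A check that treats the whole range as interrupt MMIO would refuse them, whereas a check limited to the LAPIC window itself would not. Whether Xen's actual check behaves this way is, as said above, a guess.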