Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
Hi, Stefano!

On 15.09.21 23:19, Stefano Stabellini wrote:
> On Wed, 15 Sep 2021, Stefano Stabellini wrote:
>> On Wed, 15 Sep 2021, Oleksandr Andrushchenko wrote:
>>> On 15.09.21 03:36, Stefano Stabellini wrote:
>>>> On Tue, 14 Sep 2021, Oleksandr Andrushchenko wrote:
>>>>> With the patch above I have the following log in Domain-0:
>>>>>
>>>>> generic-armv8-xt-dom0 login: root
>>>>> root@generic-armv8-xt-dom0:~# (XEN) *** Serial input to Xen (type 'CTRL-a' three times to switch input)
>>>>> (XEN) ==== PCI devices ====
>>>>> (XEN) ==== segment 0000 ====
>>>>> (XEN) 0000:03:00.0 - d0 - node -1
>>>>> (XEN) 0000:02:02.0 - d0 - node -1
>>>>> (XEN) 0000:02:01.0 - d0 - node -1
>>>>> (XEN) 0000:02:00.0 - d0 - node -1
>>>>> (XEN) 0000:01:00.0 - d0 - node -1
>>>>> (XEN) 0000:00:00.0 - d0 - node -1
>>>>> (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
>>>>>
>>>>> root@generic-armv8-xt-dom0:~# modprobe e1000e
>>>>> [   46.104729] e1000e: Intel(R) PRO/1000 Network Driver
>>>>> [   46.105479] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
>>>>> [   46.107297] e1000e 0000:03:00.0: enabling device (0000 -> 0002)
>>>>> (XEN) map [e0000, e001f] -> 0xe0000 for d0
>>>>> (XEN) map [e0020, e003f] -> 0xe0020 for d0
>>>>> [   46.178513] e1000e 0000:03:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
>>>>> [   46.189668] pci_msi_setup_msi_irqs
>>>>> [   46.191016] nwl_compose_msi_msg msg at fe440000
>>>>> (XEN) traps.c:2014:d0v0 HSR=0x00000093810047 pc=0xffff8000104b4b00 gva=0xffff800010fa5000 gpa=0x000000e0040000
>>>>> [   46.200455] Unhandled fault at 0xffff800010fa5000
>>>>>
>>>>> [snip]
>>>>>
>>>>> [   46.233079] Call trace:
>>>>> [   46.233559]  __pci_write_msi_msg+0x70/0x180
>>>>> [   46.234149]  pci_msi_domain_write_msg+0x28/0x30
>>>>> [   46.234869]  msi_domain_activate+0x5c/0x88
>>>>>
>>>>> From the above you can see that BARs are now mapped for Domain-0
>>>>> only when an assigned PCI device gets enabled in Domain-0.
>>>>>
>>>>> Another thing to note is that we crash on MSI-X access: BARs are
>>>>> mapped to the domain while enabling memory decoding in the COMMAND
>>>>> register, but the MSI-X tables are handled differently, i.e. we
>>>>> have MSI-X holes in the mappings.
>>>>>
>>>>> This is because before this change the whole PCI aperture was
>>>>> mapped into Domain-0, and now it is not. Thus, the MSI-X holes are
>>>>> left unmapped now, whereas before there was no need to handle them
>>>>> specially: they were always mapped into Domain-0 and emulated for
>>>>> guests.
>>>>>
>>>>> Please note that one cannot use xl pci-attach in this case to
>>>>> attach the PCI device in question to Domain-0, as (please see the
>>>>> log) that device is already attached. Nor can it be detached and
>>>>> re-attached. So, without mapping the MSI-X holes for Domain-0, the
>>>>> device becomes unusable in Domain-0. At the same time the device
>>>>> can be successfully passed to a DomU.
>>>>>
>>>>> Julien, Stefano! Please let me know how we can proceed with this.
>>>> What was the plan for MSI-X in Dom0?
>>> It just worked because we mapped everything.
>>>> Given that Dom0 interacts with a virtual-ITS and virtual-GIC rather
>>>> than a physical-ITS and physical-GIC, I imagine that it wasn't
>>>> correct for Dom0 to write to the real MSI-X table directly?
>>>>
>>>> Shouldn't Dom0 get emulated MSI-X tables like any DomU?
>>>>
>>>> Otherwise, if Dom0 is expected to have the real MSI-X tables mapped,
>>>> then I would suggest to map them at the same time as the BARs. But I
>>>> am thinking that Dom0 should get emulated MSI-X tables, not physical
>>>> MSI-X tables.
>>> Yes, it seems more than reasonable to enable emulation for Domain-0
>>> as well. Other than that, Stefano, do you think we are good to go
>>> with the changes I did in order to unmap everything for Domain-0?
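To illustrate the direction, here is a rough sketch only, loosely
modelled on how x86 vPCI's modify_bars() behaves; num_bars(),
bar_addr(), bar_size(), msix_table_addr() and msix_table_size() are
made-up helpers, not code from this series:

    /*
     * Hypothetical sketch: on a COMMAND-register write that enables
     * memory decoding, map the device's BARs into the domain's p2m,
     * but leave the MSI-X table pages unmapped so that accesses to
     * them keep trapping into the emulator.
     */
    static void cmd_write_sketch(struct domain *d,
                                 const struct pci_dev *pdev,
                                 uint16_t cmd)
    {
        unsigned int i;

        if ( !(cmd & PCI_COMMAND_MEMORY) )
            return; /* Nothing to map until memory decoding is on. */

        for ( i = 0; i < num_bars(pdev); i++ )
        {
            paddr_t base = bar_addr(pdev, i);   /* made-up helper */
            paddr_t size = bar_size(pdev, i);   /* made-up helper */

            /* Map each BAR 1:1 into the hardware domain's p2m... */
            map_mmio_regions(d, gaddr_to_gfn(base), PFN_UP(size),
                             maddr_to_mfn(base));
        }

        /*
         * ...but punch a hole over the MSI-X table, so that writes
         * such as the __pci_write_msi_msg() one in the log above trap
         * and get emulated for Dom0 just like for a DomU.
         */
        unmap_mmio_regions(d, gaddr_to_gfn(msix_table_addr(pdev)),
                           PFN_UP(msix_table_size(pdev)),
                           maddr_to_mfn(msix_table_addr(pdev)));
    }

The point is that the MSI-X pages are deliberately left as p2m holes,
so both Dom0 and DomU accesses to the table land in the emulator
instead of faulting or hitting the real hardware table.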
>> It might be better to resend the series with the patch in it, because
>> it is difficult to review the patch like this.
This is true. Taking into account the Xen release plan, I am just trying
to minimize the turnaround here. Sorry about this.
>> Nonetheless I tried, but I might have missed something.
Thank you for your time!!
>>
>> Previously the whole PCIe bridge aperture was mapped to Dom0, and
>> it was done by map_range_to_domain, is that correct?
Yes, but not only the aperture: please see below.
>>
>> Now this patch, to avoid mapping the entire aperture to Dom0, is
>> skipping any operations for PCIe devices in map_range_to_domain by
>> setting need_mapping = false.
>>
>> The idea is that instead, we'll only map things when needed and not
>> the whole aperture. However, looking at the changes to
>> pci_host_bridge_mappings (formerly known as
>> pci_host_bridge_need_p2m_mapping), it is still going through the full
>> list of address ranges of the PCIe bridge and mapping each range one
>> by one using map_range_to_domain. Also, pci_host_bridge_mappings is
>> still called unconditionally at boot for Dom0.
>>
>> So I am missing the part where the aperture is actually *not* mapped
>> to Dom0.
With map_range_to_domain we also mapped all the entries of the "reg" and
"ranges" properties. Let's have a look at [1]:

- ranges : As described in IEEE Std 1275-1994, but must provide at least
  a definition of non-prefetchable memory. One or both of prefetchable
  Memory and IO Space may also be provided.
- reg : The Configuration Space base address and size, as accessed from
  the parent bus. The base address corresponds to the first bus in the
  "bus-range" property. If no "bus-range" is specified, this will be
  bus 0 (the default).

The most interesting case comes when "reg" also contains addresses other
than the configuration space, for example [2]:

- reg: Should contain rc_dbi, config registers location and length.
- reg-names: Must include the following entries:
  "rc_dbi": controller configuration registers;
  "config": PCIe configuration space registers.

So, we don't need to map "ranges" and the *config* entry of "reg", but
all the rest of "reg" still needs to be mapped to Domain-0, so that the
PCIe bridge can remain functional in Domain-0.
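In other words, something along these lines; a simplified sketch with
error handling trimmed, where map_regions_p2mt() stands in for the
map_range_to_domain path, and the filter is the ECAM-specific
pci_ecam_need_p2m_mapping() discussed further below (in the real code it
would be reached via a per-bridge hook, since not every bridge is ECAM):

    /*
     * Simplified sketch of pci_host_bridge_mappings: walk every
     * address entry of the host bridge node and map it 1:1 to Dom0,
     * except the ECAM "config" window, which stays unmapped so it can
     * be emulated.
     */
    static int host_bridge_mappings_sketch(struct domain *d,
                                           struct dt_device_node *dev,
                                           struct pci_host_bridge *bridge)
    {
        unsigned int i;

        for ( i = 0; i < dt_number_of_address(dev); i++ )
        {
            u64 addr, size;

            if ( dt_device_get_address(dev, i, &addr, &size) )
                return -EINVAL;

            /* Leave the configuration space out of the p2m... */
            if ( !pci_ecam_need_p2m_mapping(d, bridge, addr, size) )
                continue;

            /* ...but keep everything else (e.g. "rc_dbi") mapped. */
            if ( map_regions_p2mt(d, gaddr_to_gfn(addr), PFN_UP(size),
                                  maddr_to_mfn(addr),
                                  p2m_mmio_direct_dev) )
                return -EFAULT;
        }

        return 0;
    }

For the hisilicon example above, this would map "rc_dbi" 1:1 to Domain-0
but leave "config" to the emulator.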
>> What's the difference between the loop in pci_host_bridge_mappings:
>>
>>     for ( i = 0; i < dt_number_of_address(dev); i++ )
>>         map_range_to_domain...
>>
>> and the previous code in map_range_to_domain? I think I am missing
>> something, but as mentioned it is difficult to review the patch like
>> this, out of order.
>>
>> Also, and this is minor, even if currently unused, it might be good
>> to keep a length parameter to pci_host_bridge_need_p2m_mapping.
> It looks like the filtering is done based on:
>
>     return cfg->phys_addr != addr
As I explained above, this is *now* the only range that we *do not want*
to be mapped to Domain-0. The other "reg" entries still need to be
mapped.
> in pci_ecam_need_p2m_mapping that is expected to filter out the address
> ranges we don't want to map, because it comes from
> xen/arch/arm/pci/pci-host-common.c:gen_pci_init:
>
>     /* Parse our PCI ecam register address*/
>     err = dt_device_get_address(dev, ecam_reg_idx, &addr, &size);
>     if ( err )
>         goto err_exit;
>
> In pci_host_bridge_mappings, instead of parsing the device tree again,
> can't we just fetch the address and length we need to map straight from
> bridge->sysdata->phys_addr/size?
We can't, as that address describes the configuration space, which we
*do not* want to be mapped; what we do want are the entries of the "reg"
property other than the configuration space.
> At the point when pci_host_bridge_mappings is called in your new patch,
> we have already initialized all the PCIe-related data structures. It
> seems a bit useless to have to go via device tree again.
Bottom line: we do need to go over the "reg" property and map the
regions on which the PCIe bridge depends, and only skip the "config"
part of it.
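For reference, the filter this boils down to, reconstructed from the
snippets quoted above (function and field names as in the thread;
gen_pci_init saved the parsed "config" window in bridge->sysdata):

    /*
     * ECAM-specific filter: the only "reg" entry we refuse to map for
     * Dom0 is the one matching the configuration-space window that
     * gen_pci_init parsed and stored in bridge->sysdata.
     */
    static bool pci_ecam_need_p2m_mapping(struct domain *d,
                                          struct pci_host_bridge *bridge,
                                          uint64_t addr, uint64_t len)
    {
        struct pci_config_window *cfg = bridge->sysdata;

        /* Map every "reg" entry except the config space itself. */
        return cfg->phys_addr != addr;
    }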
Thank you,
Oleksandr

[1] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt
[2] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt