[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen/arm: domain_build: Ignore device nodes with invalid addresses



On Tue, 2 Feb 2021, Julien Grall wrote:
> On 02/02/2021 18:12, Julien Grall wrote:
> > On 02/02/2021 17:47, Elliott Mitchell wrote:
> > > The handle_device() function has been returning failure upon
> > > encountering a device address which was invalid.  A device tree which
> > > had such an entry has now been seen in the wild.  As it causes no
> > > failures to simply ignore the entries, ignore them. >
> > > Signed-off-by: Elliott Mitchell <ehem+xenn@xxxxxxx>
> > > 
> > > ---
> > > I'm starting to suspect there are an awful lot of places in the various
> > > domain_build.c files which should simply ignore errors.  This is now the
> > > second place I've encountered in 2 months where ignoring errors was the
> > > correct action.
> > 
> > Right, as a counterpoint, we run Xen on Arm HW for several years now and
> > this is the first time I heard about issue parsing the DT. So while I
> > appreciate that you are eager to run Xen on the RPI...
> > 
> > >  I know failing in case of error is an engineer's
> > > favorite approach, but there seem an awful lot of harmless failures
> > > causing panics.
> > > 
> > > This started as the thread "[RFC PATCH] xen/arm: domain_build: Ignore
> > > empty memory bank".  Now it seems clear the correct approach is to simply
> > > ignore these entries.
> > 
> > ... we first need to fully understand the issues. Here a few questions:
> >     1) Can you provide more information why you believe the address is
> > invalid?
> >     2) How does Linux use the node?
> >     3) Is it happening with all the RPI DT? If not, what are the
> > differences?
> 
> So I had another look at the device-tree you provided earlier on. The node is
> the following (copied directly from the DTS):
> 
> &pcie0 {
>         pci@1,0 {
>                 #address-cells = <3>;
>                 #size-cells = <2>;
>                 ranges;
> 
>                 reg = <0 0 0 0 0>;
> 
>                 usb@1,0 {
>                         reg = <0x10000 0 0 0 0>;
>                         resets = <&reset RASPBERRYPI_FIRMWARE_RESET_ID_USB>;
>                 };
>         };
> };
> 
> pcie0: pcie@7d500000 {
>    compatible = "brcm,bcm2711-pcie";
>    reg = <0x0 0x7d500000  0x0 0x9310>;
>    device_type = "pci";
>    #address-cells = <3>;
>    #interrupt-cells = <1>;
>    #size-cells = <2>;
>    interrupts = <GIC_SPI 148 IRQ_TYPE_LEVEL_HIGH>,
>                 <GIC_SPI 148 IRQ_TYPE_LEVEL_HIGH>;
>    interrupt-names = "pcie", "msi";
>    interrupt-map-mask = <0x0 0x0 0x0 0x7>;
>    interrupt-map = <0 0 0 1 &gicv2 GIC_SPI 143
>                                                      IRQ_TYPE_LEVEL_HIGH>;
>    msi-controller;
>    msi-parent = <&pcie0>;
> 
>    ranges = <0x02000000 0x0 0xc0000000 0x6 0x00000000
>              0x0 0x40000000>;
>    /*
>     * The wrapper around the PCIe block has a bug
>     * preventing it from accessing beyond the first 3GB of
>     * memory.
>     */
>    dma-ranges = <0x02000000 0x0 0x00000000 0x0 0x00000000
>                  0x0 0xc0000000>;
>    brcm,enable-ssc;
> };
> 
> The interpretation of "reg" depends on the context. In this case, we are
> trying to interpret as a memory address from the CPU PoV when it has a
> different meaning (I am not exactly sure what).
> 
> In fact, you are lucky that Xen doesn't manage to interpret it. Xen should
> really stop trying to look region to map when it discover a PCI bus. I wrote a
> quick hack patch that should ignore it:

Yes, I think you are right. There are a few instances where "reg" is not
a address ready to be remapped. It is not just PCI, although that's the
most common.  Maybe we need a list, like skip_matches in handle_node.


> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 374bf655ee34..937fd1e387b7 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1426,7 +1426,7 @@ static int __init handle_device(struct domain *d, struct
> dt_device_node *dev,
> 
>  static int __init handle_node(struct domain *d, struct kernel_info *kinfo,
>                                struct dt_device_node *node,
> -                              p2m_type_t p2mt)
> +                              p2m_type_t p2mt, bool pci_bus)
>  {
>      static const struct dt_device_match skip_matches[] __initconst =
>      {
> @@ -1532,9 +1532,14 @@ static int __init handle_node(struct domain *d, struct
> kernel_info *kinfo,
>                 "WARNING: Path %s is reserved, skip the node as we may re-use
> the path.\n",
>                 path);
> 
> -    res = handle_device(d, node, p2mt);
> -    if ( res)
> -        return res;
> +    if ( !pci_bus )
> +    {
> +        res = handle_device(d, node, p2mt);
> +        if ( res)
> +           return res;
> +
> +        pci_bus = dt_device_type_is_equal(node, "pci");
> +    }
> 
>      /*
>       * The property "name" is used to have a different name on older FDT
> @@ -1554,7 +1559,7 @@ static int __init handle_node(struct domain *d, struct
> kernel_info *kinfo,
> 
>      for ( child = node->child; child != NULL; child = child->sibling )
>      {
> -        res = handle_node(d, kinfo, child, p2mt);
> +        res = handle_node(d, kinfo, child, p2mt, pci_bus);
>          if ( res )
>              return res;
>      }
> @@ -2192,7 +2197,7 @@ static int __init prepare_dtb_hwdom(struct domain *d,
> struct kernel_info *kinfo)
> 
>      fdt_finish_reservemap(kinfo->fdt);
> 
> -    ret = handle_node(d, kinfo, dt_host, default_p2mt);
> +    ret = handle_node(d, kinfo, dt_host, default_p2mt, false);
>      if ( ret )
>          goto err;
> 
> A less hackish possibility would be to modify dt_number_of_address() and
> return 0 when the device is a child of a PCI below.
> 
> Stefano, do you have any opinions?

Would PCIe even work today? Because if it doesn't, we could just add it
to skip_matches until we get PCI passthrough properly supported.

But aside from PCIe, let's say that we know of a few nodes for which
"reg" needs a special treatment. I am not sure it makes sense to proceed
with parsing those nodes without knowing how to deal with that. So maybe
we should add those nodes to skip_matches until we know what to do with
them. At that point, I would imagine we would introduce a special
handle_device function that knows what to do. In the case of PCIe,
something like "handle_device_pcie".

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.