[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 2/5] xen/arm: assign devices to boot domains



On Mon, 24 Dec 2018, Julien Grall wrote:
> Hi Stefano,
> 
> On 12/5/18 5:28 PM, Stefano Stabellini wrote:
> > Scan the user provided dtb fragment at boot. For each device node, map
> > memory to guests, and route interrupts and setup the iommu.
> > 
> > Device memory is only mapped 1:1. It is not possible to assign devices at
> > locations that conflict with the DomU memory map.
> I think 1:1 mapped is a pretty bad idea. You limit yourself a lot as the user
> does not control neither the HW nor the guest memory map.
> 
> So you need to provide a way for the user to specify the mapping.

Yes, I think we want to introduce a separate xen,reg property that
includes a destination address as well. Something of the format:

  xen,reg = <source_addr source_size dest_addr>

More on this below.


> > 
> > The iommu is setup by passing the to the device to assign on the host
> 
> NIT: "the node to..."

I'll fix

 
> > device tree. The path is specified in the device tree fragment as the
> > "path" string property.
> 
> Path is too generic and may clash in the future with other bindings. Please
> add "xen," in front.

OK, I can rename


> > 
> > Signed-off-by: Stefano Stabellini <stefanos@xxxxxxxxxx>
> > ---
> >   xen/arch/arm/bootfdt.c        |  4 +-
> >   xen/arch/arm/domain_build.c   | 85
> > +++++++++++++++++++++++++++++++++++++++++++
> >   xen/include/xen/device_tree.h |  2 +
> >   3 files changed, 89 insertions(+), 2 deletions(-)
> > 
> > diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
> > index 891b4b6..72cb8d6 100644
> > --- a/xen/arch/arm/bootfdt.c
> > +++ b/xen/arch/arm/bootfdt.c
> > @@ -55,8 +55,8 @@ static bool __init device_tree_node_compatible(const void
> > *fdt, int node,
> >       return false;
> >   }
> >   -static void __init device_tree_get_reg(const __be32 **cell, u32
> > address_cells,
> > -                                       u32 size_cells, u64 *start, u64
> > *size)
> > +void __init device_tree_get_reg(const __be32 **cell, u32 address_cells,
> > +                                u32 size_cells, u64 *start, u64 *size)
> >   {
> >       *start = dt_next_cell(address_cells, cell);
> >       *size = dt_next_cell(size_cells, cell);
> > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > index cc6b464..d48f77e 100644
> > --- a/xen/arch/arm/domain_build.c
> > +++ b/xen/arch/arm/domain_build.c
> > @@ -2094,6 +2094,88 @@ static int __init construct_domain(struct domain *d,
> > struct kernel_info *kinfo)
> >       return 0;
> >   }
> >   +static int __init scan_pt_node(const void *pfdt,
> > +                               int nodeoff, const char *name, int depth,
> > +                               u32 address_cells, u32 size_cells,
> > +                               void *data)
> 
> Is it really necessary to parse the device-tree twice?

I don't think I understand this comment. The device tree fragment is not
scanned twice, just once I think. Am I missing something?


> > +{
> > +    int rc;
> > +    struct dt_device_node *node;
> > +    int len, i;
> > +    const struct fdt_property *prop;
> > +    struct kernel_info *kinfo = data;
> > +    struct domain *d = kinfo->d;
> > +    const __be32 *cell;
> > +
> > +    if ( depth > 2 )
> > +        return 0;
> 
> Why do you limit yourself to depth 2? It is possible to have nested node
> describing memory

That was because Xen needed a far better understanding of how reg works
to be able to parse nested nodes correctly. Specifically, see this
concrete example from ZynqMP:

        ethernet@ff0e0000 {
            compatible = "cdns,zynqmp-gem";
            status = "okay";
            interrupt-parent = <0xfde8>;
            interrupts = <0x0 0x3f 0x4 0x0 0x3f 0x4>;
            reg = <0x0 0xff0e0000 0x1000>;
            clock-names = "pclk", "hclk", "tx_clk", "rx_clk";
            #address-cells = <0x1>;
            #size-cells = <0x0>;
            clocks = <0x1 0x1 0x1 0x1>;
            phy-mode = "rgmii-id";
            xlnx,ptp-enet-clock = <0x0>;
            local-mac-address = [00 0a 35 00 22 01];
            phy-handle = <0x2>;
            xen,path = "/amba/ethernet@ff0e0000";

            phy@c {
                reg = <0xc>;
                ti,rx-internal-delay = <0x8>;
                ti,tx-internal-delay = <0xa>;
                ti,fifo-depth = <0x1>;
                ti,rxctrl-strap-worka;
                linux,phandle = <0x2>;
                phandle = <0x2>;
            };
        };

We want to map the first reg property under ethernet@ff0e0000, not the
reg property under phy@c, which it doesn't even refer to a memory
region. This problem goes away completely if we start using a new Xen
specific "xen,reg" property, instead of trying to figure out which
existing reg to map. With that, we can safely remove the artificial
depth <= 2 restriction.


> > +
> > +    prop = fdt_get_property_namelen(pfdt, nodeoff, "reg", strlen("reg"),
> > &len);
> > +    if ( prop != NULL )
> > +    {
> > +        paddr_t start, size;
> > +        cell = (const __be32 *)prop->data;
> > +        len = fdt32_to_cpu(prop->len) /
> > +              ((address_cells + size_cells) * sizeof (u32));
> > +
> > +        for ( i = 0; i < len; i++ )
> > +        {
> > +            device_tree_get_reg(&cell, address_cells, size_cells, &start,
> > &size);
> 
> Here you assume the value in regs correspond to host physical address. This
> may not be the case if a device is under a bus.

Yes, one more reason to move away from parsing the existing reg property
and use a new xen,reg property that gives directly the host physical
address to map to the right guest physical address (regardless of
busses). xen,reg would basically be equivalent to the iomem VM config
file option.


> > +
> > +            rc = guest_physmap_add_entry(d, _gfn(start >> PAGE_SHIFT),
> 
> Please use gaddr_to_gfn and ...

OK


> > +                                         _mfn(start >> PAGE_SHIFT),
> 
> ... maddr_to_mfn

OK


> > +                                         get_order_from_bytes(size),
> > +                                         p2m_mmio_direct_dev);
> 
> I think the restriction on the memory attributes should be documented in patch
> #5.

OK


> > +            if ( rc < 0 )
> > +            {
> > +                dprintk(XENLOG_ERR, "Failed to map %"PRIpaddr" to the
> > guest\n", start);
> > +                return -EFAULT;
> > +            }
> > +        }
> > +    }
> > +
> > +    prop = fdt_get_property_namelen(pfdt, nodeoff, "path", strlen("path"),
> > &len);
> > +    if ( prop != NULL ) {
> > +        node = dt_find_node_by_path((char *)&prop->data[0]);
> 
> What's wrong with giving directly prop->data?

Nothing wrong with it, just a matter of code style. I don't care about
this, I can change it.


> > +        if ( node != NULL )
> > +            rc = iommu_assign_dt_device(d, node);
> > +        else
> > +            dprintk(XENLOG_ERR, "Couldn't find node %s in host_dt!\n",
> > +                    (char *)&prop->data[0]);
> 
> Same here.

Same


> > +    }
> > +
> > +    prop = fdt_get_property_namelen(pfdt, nodeoff, "interrupts",
> > strlen("interrupts"), &len);
> > +    if ( prop != NULL )
> > +    {
> > +        int pt_irq;
> > +        u32 *u = (u32*) &prop->data[0];
> 
> Same here for &prop->data[0]. But this stores a fdt_32 and not u32.

OK


> > +
> > +        pt_irq = fdt32_to_cpu(*(u + 1)) + 32;
> > +
> > +        vgic_reserve_virq(d, pt_irq);
> > +        rc = route_irq_to_guest(d, pt_irq, pt_irq, "routed IRQ");
> > +        if ( rc < 0 )
> > +            return rc;
> 
> You are assuming the device can only generate one interrupt.

That's a mistake, thanks for spotting it, I'll fix it.


> Furthermore, this is assuming all the nodes in the fragment will be
> under the GIC controller.  You may want to passthrough a interrupt
> controller (i.e GPIO) to the guest and the related device.

I don't think that the non-GIC scenario is supported today. I don't
think it works even without dom0less. I wrote more about this as a reply
to the other email.


> > +    }
> 
> NIT: newline here.

OK


> 
> > +    return 0;
> > +}
> > +
> > +static int __init domain_addign_devices(struct domain *d,
> 
> s/addign/adding/

Damn! Thanks!


> > +                                        struct kernel_info *kinfo)
> > +{
> > +    void *pfdt;
> > +
> > +    pfdt = ioremap_cache(kinfo->dtb_bootmodule->start,
> > +            kinfo->dtb_bootmodule->size);
> > +    if ( pfdt == NULL )
> > +        return -EFAULT;
> > +
> > +    device_tree_for_each_node(pfdt, scan_pt_node, kinfo);
> > +
> > +    iounmap(pfdt);
> > +    return 0;
> > +}
> > +
> >   static int __init construct_domU(struct domain *d,
> >                                    const struct dt_device_node *node)
> >   {
> > @@ -2140,6 +2222,9 @@ static int __init construct_domU(struct domain *d,
> >       if ( kinfo.vpl011 )
> >           rc = domain_vpl011_init(d, NULL);
> >   +    if ( kinfo.dtb_bootmodule )
> > +        rc = domain_addign_devices(d, &kinfo);
> > +
> >       return rc;
> >   }
> >   diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
> > index 7408a6c..356a422 100644
> > --- a/xen/include/xen/device_tree.h
> > +++ b/xen/include/xen/device_tree.h
> > @@ -161,6 +161,8 @@ extern const void *device_tree_flattened;
> >   int device_tree_for_each_node(const void *fdt,
> >                                        device_tree_node_func func,
> >                                        void *data);
> > +void device_tree_get_reg(const __be32 **cell, u32 address_cells,
> > +                         u32 size_cells, u64 *start, u64 *size);
> >     /**
> >    * dt_unflatten_host_device_tree - Unflatten the host device tree

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.