[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH V1] iommu/arm: Add Renesas IPMMU-VMSA support



Hi,

Apologies for the late answer. I have been traveling for the past few weeks.

On 6/26/19 11:30 AM, Oleksandr Tyshchenko wrote:
From: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>

The IPMMU-VMSA is VMSA-compatible I/O Memory Management Unit (IOMMU)
which provides address translation and access protection functionalities
to processing units and interconnect networks.

Do you have a link to the specification?

My knowledge about the IPMMU is quite limited, so for now, I will mostly look at Xen specific code. It would be good if someone with a better knowledge of the driver could have a look.

[...]

diff --git a/xen/drivers/passthrough/Kconfig b/xen/drivers/passthrough/Kconfig
index a3c0649..e57aa29 100644
--- a/xen/drivers/passthrough/Kconfig
+++ b/xen/drivers/passthrough/Kconfig
@@ -12,4 +12,17 @@ config ARM_SMMU
Say Y here if your SoC includes an IOMMU device implementing the
          ARM SMMU architecture.
+
+config IPMMU_VMSA
+       bool "Renesas IPMMU-VMSA found in R-Car Gen3 SoCs"
+       default y

I would prefer this to be a default N for the time-being.

+       depends on ARM_64
+       ---help---
+         Support for implementations of the Renesas IPMMU-VMSA found
+         in R-Car Gen3 SoCs.
+
+         Say Y here if you are using newest R-Car Gen3 SoCs revisions which 
IPMMU

What new now will be old soon ;). So it would be best if you give a precise revision here.

+         hardware supports stage 2 translation table format and is able to use
+         CPU's P2M table as is.
+
  endif
diff --git a/xen/drivers/passthrough/arm/Makefile 
b/xen/drivers/passthrough/arm/Makefile
index b3efcfd..40ac7a9 100644
--- a/xen/drivers/passthrough/arm/Makefile
+++ b/xen/drivers/passthrough/arm/Makefile
@@ -1,2 +1,3 @@
  obj-y += iommu.o
  obj-$(CONFIG_ARM_SMMU) += smmu.o
+obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o
diff --git a/xen/drivers/passthrough/arm/ipmmu-vmsa.c 
b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
new file mode 100644
index 0000000..5091c61
--- /dev/null
+++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
@@ -0,0 +1,1487 @@
+/*
+ * xen/drivers/passthrough/arm/ipmmu-vmsa.c
+ *
+ * Driver for the Renesas IPMMU-VMSA found in R-Car Gen3 SoCs.
+ *
+ * The IPMMU-VMSA is VMSA-compatible I/O Memory Management Unit (IOMMU)
+ * which provides address translation and access protection functionalities
+ * to processing units and interconnect networks.
+ *
+ * Please note, current driver is supposed to work only with newest Gen3 SoCs
+ * revisions which IPMMU hardware supports stage 2 translation table format and
+ * is able to use CPU's P2M table as is.
+ *
+ * Based on Linux's IPMMU-VMSA driver from Renesas BSP:
+ *    drivers/iommu/ipmmu-vmsa.c

What are the major differences compare the Linux driver?

+ * you can found at:
+ *    url: git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git
+ *    branch: v4.14.75-ltsi/rcar-3.9.2
+ *    commit: a5266d298124874c2c06b8b13d073f6ecc2ee355

Is there any reason to use the BSP driver and not the one provided by Linux directly?

[...]

+#define dev_archdata(dev) ((struct ipmmu_vmsa_xen_device *)dev->archdata.iommu)
+
+/* Root/Cache IPMMU device's information */
+struct ipmmu_vmsa_device
+{

AFAICT, this file was converted to Xen coding style. If so, the style for struct is

struct ... {
...
};

+    struct device *dev;
+    void __iomem *base;
+    struct ipmmu_vmsa_device *root;
+    struct list_head list;
+    unsigned int num_utlbs;
+    unsigned int num_ctx;
+    spinlock_t lock;    /* Protects ctx and domains[] */
+    DECLARE_BITMAP(ctx, IPMMU_CTX_MAX);
+    struct ipmmu_vmsa_domain *domains[IPMMU_CTX_MAX];
+};
+
+/*
+ * Root/Cache IPMMU domain's information
+ *
+ * Root IPMMU device is assigned to Root IPMMU domain while Cache IPMMU device
+ * is assigned to Cache IPMMU domain. Master devices are connected to Cache
+ * IPMMU devices through specific ports called micro-TLBs.
+ * All Cache IPMMU devices, in turn, are connected to Root IPMMU device
+ * which manages IPMMU context.
+ */
+struct ipmmu_vmsa_domain
+{

Ditto.

+    /*
+     * IPMMU device assigned to this IPMMU domain.
+     * Either Root device which is located at the main memory bus domain or
+     * Cache device which is located at each hierarchy bus domain.
+     */
+    struct ipmmu_vmsa_device *mmu;
+
+    /* Context used for this IPMMU domain */
+    unsigned int context_id;
+
+    /* Xen domain associated with this IPMMU domain */
+    struct domain *d;
+
+    /* The fields below are used for Cache IPMMU domain only */
+
+    /*
+     * Used to keep track of the master devices which are attached to this
+     * IPMMU domain (domain users). Master devices behind the same IPMMU device
+     * are grouped together by putting into the same IPMMU domain.
+     * Only when the refcount reaches 0 this IPMMU domain can be destroyed.
+     */
+    int refcount;

If the refcount cannot be 0, then I would prefer an unsigned int here.

+    /* Used to link this IPMMU domain for the same Xen domain */
+    struct list_head list;
+};


[...]

+/* Read/Write Access */
+static u32 ipmmu_read(struct ipmmu_vmsa_device *mmu, unsigned int offset)

If you are going to use Xen coding style, then this should be "uint32_t". The same is valid everywhere in this file, I am not going to mention all of them :).

+{
+    return readl(mmu->base + offset);
+}
+
+static void ipmmu_write(struct ipmmu_vmsa_device *mmu, unsigned int offset,
+                        u32 data)
+{
+    writel(data, mmu->base + offset);
+}
+
+static u32 ipmmu_ctx_read_root(struct ipmmu_vmsa_domain *domain,
+                               unsigned int reg)
+{
+    return ipmmu_read(domain->mmu->root,
+                      domain->context_id * IM_CTX_SIZE + reg);
+}
+
+static void ipmmu_ctx_write_root(struct ipmmu_vmsa_domain *domain,
+                                 unsigned int reg, u32 data)
+{
+    ipmmu_write(domain->mmu->root,
+                domain->context_id * IM_CTX_SIZE + reg, data);
+}
+
+static void ipmmu_ctx_write_cache(struct ipmmu_vmsa_domain *domain,
+                                  unsigned int reg, u32 data)
+{
+    ASSERT(reg == IMCTR);

What's the rationale of passing reg in parameter if it can only be equal to IMCTR?

+
+    /* Mask fields which are implemented in IPMMU-MM only. */
+    if ( !ipmmu_is_root(domain->mmu) )
+        ipmmu_write(domain->mmu, domain->context_id * IM_CTX_SIZE + reg,
+                    data & IMCTR_COMMON_MASK);
+}

[...]

+/* Enable MMU translation for the micro-TLB. */
+static void ipmmu_utlb_enable(struct ipmmu_vmsa_domain *domain,
+                              unsigned int utlb)
+{
+    struct ipmmu_vmsa_device *mmu = domain->mmu;
+
+    /*
+     * TODO: Reference-count the micro-TLB as several bus masters can be
+     * connected to the same micro-TLB.
+     */

What would be the exact problem if this is not handled? Could a utlb shared between multiple domain?

+    ipmmu_write(mmu, IMUASID(utlb), 0);
+    ipmmu_write(mmu, IMUCTR(utlb), ipmmu_read(mmu, IMUCTR(utlb)) |
+                IMUCTR_TTSEL_MMU(domain->context_id) | IMUCTR_MMUEN);
+}

[...]

+static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
+{
+    u64 ttbr;

s/u64/uint64_t/

+    u32 tsz0;
+    int ret;
+
+    /* Find an unused context. */
+    ret = ipmmu_domain_allocate_context(domain->mmu->root, domain);
+    if ( ret < 0 )
+        return ret;
+
+    domain->context_id = ret;
+
+    /*
+     * TTBR0
+     * Use P2M table for this Xen domain.
+     */
+    ASSERT(domain->d != NULL);
+    ttbr = page_to_maddr(domain->d->arch.p2m.root);
+
+    dev_info(domain->mmu->root->dev, "d%d: Set IPMMU context %u (pgd 
0x%"PRIx64")\n",

We introduce a format specifier to print a domain. So you can use %pd in combination of just domain->d.

+             domain->d->domain_id, domain->context_id, ttbr);
+
+    ipmmu_ctx_write_root(domain, IMTTLBR0, ttbr & IMTTLBR0_TTBR_MASK);
+    ipmmu_ctx_write_root(domain, IMTTUBR0, (ttbr >> 32) & IMTTUBR0_TTBR_MASK);
+
+    /*
+     * TTBCR
+     * We use long descriptors with inner-shareable WBWA tables and allocate
+     * the whole "p2m_ipa_bits" IPA space to TTBR0. Use 4KB page granule.
+     * Start page table walks at first level. Bypass stage 1 translation
+     * when only stage 2 translation is performed.

I am not sure to understand the last sentence. You only use stage2 right, so it shouldn't it be "Always bypass stage 1 translation"?

+     */
+    tsz0 = (64 - p2m_ipa_bits) << IMTTBCR_TSZ0_SHIFT;
+    ipmmu_ctx_write_root(domain, IMTTBCR, IMTTBCR_EAE | IMTTBCR_PMB |
+                         IMTTBCR_SH0_INNER_SHAREABLE | IMTTBCR_ORGN0_WB_WA |
+                         IMTTBCR_IRGN0_WB_WA | IMTTBCR_SL0_LVL_1 | tsz0);

[...]

+/* Fault Handling */
+static void ipmmu_domain_irq(struct ipmmu_vmsa_domain *domain)
+{
+    const u32 err_mask = IMSTR_MHIT | IMSTR_ABORT | IMSTR_PF | IMSTR_TF;
+    struct ipmmu_vmsa_device *mmu = domain->mmu;
+    u32 status;
+    u64 iova;
+
+    status = ipmmu_ctx_read_root(domain, IMSTR);
+    if ( !(status & err_mask) )
+        return;
+
+    iova = ipmmu_ctx_read_root(domain, IMELAR) |
+                               ((u64)ipmmu_ctx_read_root(domain, IMEUAR) << 
32);
+
+    /*
+     * Clear the error status flags. Unlike traditional interrupt flag
+     * registers that must be cleared by writing 1, this status register
+     * seems to require 0. The error address register must be read before,
+     * otherwise its value will be 0.
+     */
+    ipmmu_ctx_write_root(domain, IMSTR, 0);
+
+    /* Log fatal errors. */
+    if ( status & IMSTR_MHIT )
+        dev_err_ratelimited(mmu->dev, "d%d: Multiple TLB hits @0x%"PRIx64"\n",

Similar remark for d%d here ...

+                            domain->d->domain_id, iova);
+    if ( status & IMSTR_ABORT )
+        dev_err_ratelimited(mmu->dev, "d%d: Page Table Walk Abort 
@0x%"PRIx64"\n",

... here and ...
+                            domain->d->domain_id, iova);
+
+    /* Return if it is neither Permission Fault nor Translation Fault. */
+    if ( !(status & (IMSTR_PF | IMSTR_TF)) )
+        return;
+
+    /* Flush the TLB as required when IPMMU translation error occurred. */
+    ipmmu_tlb_invalidate(domain);
+
+    dev_err_ratelimited(mmu->dev, "d%d: Unhandled fault: status 0x%08x iova 
0x%"PRIx64"\n",

... here.

+                        domain->d->domain_id, status, iova);
+}
+
+static void ipmmu_irq(int irq, void *dev, struct cpu_user_regs *regs)
+{
+    struct ipmmu_vmsa_device *mmu = dev;
+    unsigned int i;
+    unsigned long flags;
+
+    spin_lock_irqsave(&mmu->lock, flags);
+
+    /*
+     * When interrupt arrives, we don't know the context it is related to.
+     * So, check interrupts for all active contexts to locate a context
+     * with status bits set.
+    */
+    for ( i = 0; i < mmu->num_ctx; i++ )
+    {
+        if ( !mmu->domains[i] )
+            continue;
+        ipmmu_domain_irq(mmu->domains[i]);
+    }
+
+    spin_unlock_irqrestore(&mmu->lock, flags);
+}
+
+/* Master devices management */
+static int ipmmu_attach_device(struct ipmmu_vmsa_domain *domain,
+                               struct device *dev)
+{
+    struct ipmmu_vmsa_master_cfg *cfg = dev_archdata(dev)->cfg;
+    struct ipmmu_vmsa_device *mmu = cfg->mmu;
+    unsigned int i;
+
+    if ( !mmu )
+    {
+        dev_err(dev, "Cannot attach to IPMMU\n");
+        return -ENXIO;
+    }
+
+    if ( !domain->mmu )

So you read domain->mmu here and ...

+    {
+        /* The domain hasn't been used yet, initialize it. */
+        domain->mmu = mmu;
+
+        /*
+         * We have already enabled context for Root IPMMU assigned to this
+         * Xen domain in ipmmu_domain_init_context().
+         * Enable the context for Cache IPMMU only. Flush the TLB as required
+         * when modifying the context registers.
+         */
+        ipmmu_ctx_write_cache(domain, IMCTR,
+                              ipmmu_ctx_read_root(domain, IMCTR) | 
IMCTR_FLUSH);
+
+        dev_info(dev, "Using IPMMU context %u\n", domain->context_id);
+    }
+    else if ( domain->mmu != mmu )

... here. What actually promise that domain->mmu can't change in parallel?

+    {
+        /*
+         * Something is wrong, we can't attach two master devices using
+         * different IOMMUs to the same IPMMU domain.
+         */
+        dev_err(dev, "Can't attach IPMMU %s to domain on IPMMU %s\n",
+                dev_name(mmu->dev), dev_name(domain->mmu->dev));
+        return -EINVAL;
+    }
+    else
+        dev_info(dev, "Reusing IPMMU context %u\n", domain->context_id);
+
+    for ( i = 0; i < cfg->num_utlbs; ++i )
+        ipmmu_utlb_enable(domain, cfg->utlbs[i]);
+
+    return 0;
+}

[...]

+static void ipmmu_protect_masters(struct ipmmu_vmsa_device *mmu)
+{
+    struct dt_device_node *node;
+
+    dt_for_each_device_node( dt_host, node )
+    {
+        if ( mmu->dev->of_node != dt_parse_phandle(node, "iommus", 0) )
+            continue;
+
+        /* Let Xen know that the master device is protected by an IOMMU. */
+        dt_device_set_protected(node);
+
+        dev_info(mmu->dev, "Found master device %s\n", 
dt_node_full_name(node));
+    }
+}

I don't much like this. You are going to go through the whole DT quite a few times.

The IOMMU interface in Xen has not been designed with the new IOMMU bindings in mind. I would prefer if we look for extending add_device callback to support platform device.

This would allow to probe the device later on and therefore avoid to go through the device-tree multiple.

+
+static void ipmmu_device_reset(struct ipmmu_vmsa_device *mmu)
+{
+    unsigned int i;
+
+    /* Disable all contexts. */
+    for ( i = 0; i < mmu->num_ctx; ++i )
+        ipmmu_write(mmu, i * IM_CTX_SIZE + IMCTR, 0);
+}
+
+/*
+ * This function relies on the fact that Root IPMMU device is being probed
+ * the first. If not the case, it denies further Cache IPMMU device probes
+ * (returns the -ENODEV) until the Root IPMMU device has been registered
+ * for sure.

Can we look at handling -EDEFER in Xen instead?

+ */

[...]

+static int __must_check ipmmu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
+                                       unsigned int flags,
+                                       unsigned int *flush_flags)

The function is exactly the same as for the SMMU driver. Could we implement common helpers in a separate file?

+{
+    p2m_type_t t;
+
+    /*
+     * Grant mappings can be used for DMA requests. The dev_bus_addr
+     * returned by the hypercall is the MFN (not the IPA). For device
+     * protected by an IOMMU, Xen needs to add a 1:1 mapping in the domain
+     * p2m to allow DMA request to work.
+     * This is only valid when the domain is directed mapped. Hence this
+     * function should only be used by gnttab code with gfn == mfn == dfn.
+     */
+    BUG_ON(!is_domain_direct_mapped(d));
+    BUG_ON(mfn_x(mfn) != dfn_x(dfn));
+
+    /* We only support readable and writable flags */
+    if ( !(flags & (IOMMUF_readable | IOMMUF_writable)) )
+        return -EINVAL;
+
+    t = (flags & IOMMUF_writable) ? p2m_iommu_map_rw : p2m_iommu_map_ro;
+
+    /*
+     * The function guest_physmap_add_entry replaces the current mapping
+     * if there is already one...
+     */
+    return guest_physmap_add_entry(d, _gfn(dfn_x(dfn)), _mfn(dfn_x(dfn)), 0, 
t);
+}
+
+static int __must_check ipmmu_unmap_page(struct domain *d, dfn_t dfn,
+                                         unsigned int *flush_flags)

Same here.

+{
+    /*
+     * This function should only be used by gnttab code when the domain
+     * is direct mapped (i.e. gfn == mfn == dfn).
+     */
+    if ( !is_domain_direct_mapped(d) )
+        return -EINVAL;
+
+    return guest_physmap_remove_page(d, _gfn(dfn_x(dfn)), _mfn(dfn_x(dfn)), 0);
+}

Cheers,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.