[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table



Hi,

On 26/10/16 02:10, Stefano Stabellini wrote:
> Hi Andre,
> 
> Sorry for the late reply, I'll try to be faster for the next rounds of
> review. The patch looks good for a first iteration. Some comments below.

No worries and thanks for the thorough review, much appreciated.
As you can see I took my time to respond as well ;-)

> 
> On Wed, 28 Sep 2016, Andre Przywara wrote:
>> The ARM GICv3 ITS provides a new kind of interrupt called LPIs.
>> The pending bits and the configuration data (priority, enable bits) for
>> those LPIs are stored in tables in normal memory, which software has to
>> provide to the hardware.
>> Allocate the required memory, initialize it and hand it over to each
>> ITS. We limit the number of LPIs we use with a compile time constant to
>> avoid wasting memory.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx>
>> ---
>>  xen/arch/arm/Kconfig              |  6 ++++
>>  xen/arch/arm/efi/efi-boot.h       |  1 -
>>  xen/arch/arm/gic-its.c            | 76 
>> +++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/gic-v3.c             | 27 ++++++++++++++
>>  xen/include/asm-arm/cache.h       |  4 +++
>>  xen/include/asm-arm/gic-its.h     | 22 +++++++++++-
>>  xen/include/asm-arm/gic_v3_defs.h | 48 ++++++++++++++++++++++++-
>>  7 files changed, 181 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index 9fe3b8e..66e2bb8 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -50,6 +50,12 @@ config HAS_ITS
>>          depends on ARM_64
>>          depends on HAS_GICV3
>>  
>> +config HOST_LPI_BITS
>> +        depends on HAS_ITS
>> +        int "Maximum bits for GICv3 host LPIs (14-32)"
>> +        range 14 32
>> +        default "20"
>> +
>>  config ALTERNATIVE
>>      bool
>>  
>> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
>> index 045d6ce..dc64aec 100644
>> --- a/xen/arch/arm/efi/efi-boot.h
>> +++ b/xen/arch/arm/efi/efi-boot.h
>> @@ -10,7 +10,6 @@
>>  #include "efi-dom0.h"
>>  
>>  void noreturn efi_xen_start(void *fdt_ptr, uint32_t fdt_size);
>> -void __flush_dcache_area(const void *vaddr, unsigned long size);
>>  
>>  #define DEVICE_TREE_GUID \
>>  {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69, 0xaa, 
>> 0xe0}}
>> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
>> index 0f42a77..b52dff3 100644
>> --- a/xen/arch/arm/gic-its.c
>> +++ b/xen/arch/arm/gic-its.c
>> @@ -20,10 +20,86 @@
>>  #include <xen/lib.h>
>>  #include <xen/device_tree.h>
>>  #include <xen/libfdt/libfdt.h>
>> +#include <asm/p2m.h>
>>  #include <asm/gic.h>
>>  #include <asm/gic_v3_defs.h>
>>  #include <asm/gic-its.h>
>>  
>> +/* Global state */
>> +static struct {
>> +    uint8_t *lpi_property;
>> +    int host_lpi_bits;
>> +} lpi_data;
>> +
>> +/* Pending table for each redistributor */
>> +static DEFINE_PER_CPU(void *, pending_table);
>> +
>> +#define MAX_HOST_LPI_BITS                                                \
> 
> To avoid confusion, I would call this MAX_PHYS_LPI_BITS
> 
> 
>> +        min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
>> +#define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
> 
> And this MAX_PHYS_LPIS

Done.

>> +uint64_t gicv3_lpi_allocate_pendtable(void)
>> +{
>> +    uint64_t reg, attr;
>> +    void *pendtable;
> 
> I would introduce a check to make sure that this_cpu(pending_table) == NULL.

Can do. So I return back this value then, though this should never happen.

> 
>> +    attr  = GIC_BASER_CACHE_RaWaWb << 
>> GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_SameAsInner << 
>> GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
>> +
>> +    /*
>> +     * The pending table holds one bit per LPI, so we need three bits less
>> +     * than the number of LPI_BITs.
> 
> Why 3 bit less? Please add more info on how you came up with 3.

3 bits as in 2 << 3 = 8 = BITS_PER_BYTES. We need to divide by that,
which is shift by 3, which is ORDER - 3. Does that make sense?
But this mayhem goes away anyway with _xmalloc.

> 
>>         But the alignment requirement from the
>> +     * ITS is 64K, so make order at least 16 (-12).
> 
> Does it need to be 64K aligned or does it need to be at least 64K in
> size?

The first.

> That makes a big difference. If it just needs to be 64K aligned,
> you can do that with xmalloc.

Well, not xmalloc (since I don't have a data structure of that size),
but _xmalloc. I just saw that this is exported as well (I dismissed this
before because of the leading underscore).
Also "alloc pages" sounded more like what I had in mind, but I guess
aligning it to 64K serves the same purpose.

>> +     */
>> +    pendtable = alloc_xenheap_pages(MAX(lpi_data.host_lpi_bits - 3, 16) - 
>> 12, 0);
> 
> Shouldn't we be using MAX_HOST_LPI_BITS instead of
> lpi_data.host_lpi_bits to make this calculation?

I was under the impression that the redistributors expect the pending
table to cover every possible LPI as reported in GICD_TYPER (because in
contrast to PROPBASER the PENDBASER register lacks a size field).
But thinking about this again this seems to be insane, since 32 bit
worth of LPIs would lead to a 0.5GB pending table. But as the LPI
numbers are under the control of software, we can go with allocating
less - up to our internal limit - which is also what Linux does.

> 
>> +    if ( !pendtable )
>> +        return 0;
>> +
>> +    memset(pendtable, 0, BIT(lpi_data.host_lpi_bits - 3));
> 
> flush_dcache?

Uhm, yes.

> 
>> +    this_cpu(pending_table) = pendtable;
>> +
>> +    reg  = attr | GICR_PENDBASER_PTZ;
>> +    reg |= virt_to_maddr(pendtable) & GENMASK(51, 16);
>> +
>> +    return reg;
>> +}
>> +
>> +uint64_t gicv3_lpi_get_proptable()
>> +{
>> +    uint64_t attr;
>> +    static uint64_t reg = 0;
>> +
>> +    /* The property table is shared across all redistributors. */
>> +    if ( reg )
>> +        return reg;
> 
> Can't you just use lpi_data.lpi_property != NULL instead of introducing
> a new static local variable?

Seems like a good idea actually. We have to reconstruct the register
content, but that seems doable.

>> +    attr  = GIC_BASER_CACHE_RaWaWb << 
>> GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_SameAsInner << 
>> GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
>> +
>> +    lpi_data.lpi_property = alloc_xenheap_pages(MAX_HOST_LPI_BITS - 12, 0);
> 
> Please add a comment on how the order is calculated.

Does " ... - PAGE_SHIFT" suffice?

> 
> 
>> +    if ( !lpi_data.lpi_property )
>> +        return 0;
>> +
>> +    memset(lpi_data.lpi_property, GIC_PRI_IRQ | LPI_PROP_RES1, 
>> MAX_HOST_LPIS);
>> +    __flush_dcache_area(lpi_data.lpi_property, MAX_HOST_LPIS);
>> +
>> +    reg  = attr | ((MAX_HOST_LPI_BITS - 1) << 0);
>> +    reg |= virt_to_maddr(lpi_data.lpi_property) & GENMASK(51, 12);
>> +
>> +    return reg;
>> +}
>> +
>> +int gicv3_lpi_init_host_lpis(int lpi_bits)
>> +{
>> +    lpi_data.host_lpi_bits = lpi_bits;
>> +
>> +    printk("GICv3: using at most %ld LPIs on the host.\n", MAX_HOST_LPIS);
>> +
>> +    return 0;
>> +}
>> +
>>  void gicv3_its_dt_init(const struct dt_device_node *node)
>>  {
>>      const struct dt_device_node *its = NULL;
>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
>> index 238da84..2534aa5 100644
>> --- a/xen/arch/arm/gic-v3.c
>> +++ b/xen/arch/arm/gic-v3.c
>> @@ -546,6 +546,9 @@ static void __init gicv3_dist_init(void)
>>      type = readl_relaxed(GICD + GICD_TYPER);
>>      nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
>>  
>> +    if ( type & GICD_TYPE_LPIS )
>> +        gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) 
>> + 1);
> 
> Please #define a mask instead of using 0x1f
> 
> 
>> +
>>      printk("GICv3: %d lines, (IID %8.8x).\n",
>>             nr_lines, readl_relaxed(GICD + GICD_IIDR));
>>  
>> @@ -615,6 +618,26 @@ static int gicv3_enable_redist(void)
>>  
>>      return 0;
>>  }
>> +static void gicv3_rdist_init_lpis(void __iomem * rdist_base)
>> +{
>> +    uint32_t reg;
>> +    uint64_t table_reg;
>> +
>> +    if ( list_empty(&host_its_list) )
>> +        return;
>> +
>> +    /* Make sure LPIs are disabled before setting up the BASERs. */
>> +    reg = readl_relaxed(rdist_base + GICR_CTLR);
>> +    writel_relaxed(reg & ~GICR_CTLR_ENABLE_LPIS, rdist_base + GICR_CTLR);
>> +
>> +    table_reg = gicv3_lpi_allocate_pendtable();
>> +    if ( table_reg )
>> +        writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);
> 
> Maybe we want to return in case table_reg == NULL ?

I guess so. I just wonder what we would do in this case? Panic?
Theoretically we could just proceed without enabling LPIs on this
redistributor, but that's probably not what a user would expect.


Cheers,
Andre.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.