[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table



Hi Julien,

On 01/11/16 17:22, Julien Grall wrote:
> Hi Andre,
> 
> On 28/09/2016 19:24, Andre Przywara wrote:
>> The ARM GICv3 ITS provides a new kind of interrupt called LPIs.
>> The pending bits and the configuration data (priority, enable bits) for
>> those LPIs are stored in tables in normal memory, which software has to
>> provide to the hardware.
>> Allocate the required memory, initialize it and hand it over to each
>> ITS. We limit the number of LPIs we use with a compile time constant to
>> avoid wasting memory.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx>
>> ---
>>  xen/arch/arm/Kconfig              |  6 ++++
>>  xen/arch/arm/efi/efi-boot.h       |  1 -
>>  xen/arch/arm/gic-its.c            | 76
>> +++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/gic-v3.c             | 27 ++++++++++++++
>>  xen/include/asm-arm/cache.h       |  4 +++
>>  xen/include/asm-arm/gic-its.h     | 22 +++++++++++-
>>  xen/include/asm-arm/gic_v3_defs.h | 48 ++++++++++++++++++++++++-
>>  7 files changed, 181 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index 9fe3b8e..66e2bb8 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -50,6 +50,12 @@ config HAS_ITS
>>          depends on ARM_64
>>          depends on HAS_GICV3
>>
>> +config HOST_LPI_BITS
>> +        depends on HAS_ITS
>> +        int "Maximum bits for GICv3 host LPIs (14-32)"
>> +        range 14 32
>> +        default "20"
>> +
> 
> This would be better to be defined as a parameter command line. So the
> user does not need to rebuild Xen in order to increase the number of
> bits supported. It would also be useful to get a rational behind the
> default number in the commit message.

Yeah, a command line option sounds useful, though I have to check
whether this changes compile-time computation into actual runtime one.

The number is made-up, based on some reasoning on the possible memory
consumption and the number of LPIs provided. 8 MB and 1 million LPIs
sounded like a sweet spot.
If I got this correctly, Linux atm doesn't use more than 65536 LPIs.

>>  config ALTERNATIVE
>>      bool
>>
>> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
>> index 045d6ce..dc64aec 100644
>> --- a/xen/arch/arm/efi/efi-boot.h
>> +++ b/xen/arch/arm/efi/efi-boot.h
>> @@ -10,7 +10,6 @@
>>  #include "efi-dom0.h"
>>
>>  void noreturn efi_xen_start(void *fdt_ptr, uint32_t fdt_size);
>> -void __flush_dcache_area(const void *vaddr, unsigned long size);
>>
>>  #define DEVICE_TREE_GUID \
>>  {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69,
>> 0xaa, 0xe0}}
>> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
>> index 0f42a77..b52dff3 100644
>> --- a/xen/arch/arm/gic-its.c
>> +++ b/xen/arch/arm/gic-its.c
> 
> Please rename this file gic-v3-its to make clear ITS is only GICv3.

OK.

>> @@ -20,10 +20,86 @@
>>  #include <xen/lib.h>
>>  #include <xen/device_tree.h>
>>  #include <xen/libfdt/libfdt.h>
>> +#include <asm/p2m.h>
> 
> Why did you include p2m.h? This header contains stage-2 page table
> functions but I don't see any use of them within this patch.

This may be a rebase artifact introduced when shuffling patches around.

>>  #include <asm/gic.h>
>>  #include <asm/gic_v3_defs.h>
>>  #include <asm/gic-its.h>
>>
>> +/* Global state */
>> +static struct {
>> +    uint8_t *lpi_property;
>> +    int host_lpi_bits;
> 
> Please use unsigned int.
> 
>> +} lpi_data;
>> +
>> +/* Pending table for each redistributor */
>> +static DEFINE_PER_CPU(void *, pending_table);
>> +
>> +#define
>> MAX_HOST_LPI_BITS                                                \
>> +        min_t(unsigned int, lpi_data.host_lpi_bits,
>> CONFIG_HOST_LPI_BITS)
> 
> Why don't you directly initialize host_lpi_bits to the correct value?
> This would avoid to compute the min every time you use MAX_HOST_LPI_BITS
> and save few instructions.

Mmmh, probably because of forest and trees and stuff ;-)
Looks indeed like being pointless.

>> +#define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
> 
> I know that we don't support ITS for 32-bits, but I would rather avoid
> to use BIT as this macro is working on unsigned long. I would prefer if
> you introduce BIT_ULL or open-code.

Sure.

>> +
>> +uint64_t gicv3_lpi_allocate_pendtable(void)
>> +{
>> +    uint64_t reg, attr;
>> +    void *pendtable;
>> +
>> +    attr  = GIC_BASER_CACHE_RaWaWb <<
>> GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_SameAsInner <<
>> GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_InnerShareable <<
>> GICR_PENDBASER_SHAREABILITY_SHIFT;
> 
> From the spec (8.11.18 in ARM IHI 0069C) the cacheability and
> shareability could be fixed (though it marked as deprecated). Should we
> check whether the value stick?

Yeah, I guess we have to. I was just not sure what to do when it
doesn't. By design we don't touch the pending table after having it
handed in, so technically we wouldn't even need to map it. But we have
to make sure nobody steps on it and that it stays reserved.

> Also the variable attr sounds pointless as you will directly assign the
> value to reg with no more computation.

Yes, this is probably copy & pasted from a place where it needed a loop.
I just kept it because of readability. Can surely clean this up.

>> +
>> +    /*
>> +     * The pending table holds one bit per LPI, so we need three bits
>> less
>> +     * than the number of LPI_BITs. But the alignment requirement
>> from the
>> +     * ITS is 64K, so make order at least 16 (-12).
>> +     */
>> +    pendtable = alloc_xenheap_pages(MAX(lpi_data.host_lpi_bits - 3,
>> 16) - 12, 0);
>> +    if ( !pendtable )
>> +        return 0;
>> +
>> +    memset(pendtable, 0, BIT(lpi_data.host_lpi_bits - 3));
> 
> Same remark for BIT here.
> 
>> +    this_cpu(pending_table) = pendtable;
>> +
>> +    reg  = attr | GICR_PENDBASER_PTZ;
>> +    reg |= virt_to_maddr(pendtable) & GENMASK(51, 16);
> 
> I don't think the mask is useful and would need to be changed if the
> physical address bits increased as it was done in ARMv8.2.

Mmmh, not so sure we can extend the mask to cover the region that it
RES0 at the moment. We need some mask anyway (to not clobber the upper
bits), so I figured we just use what the spec says.

>> +
>> +    return reg;
>> +}
>> +
>> +uint64_t gicv3_lpi_get_proptable()
>> +{
>> +    uint64_t attr;
>> +    static uint64_t reg = 0;
>> +
>> +    /* The property table is shared across all redistributors. */
>> +    if ( reg )
>> +        return reg;
>> +
>> +    attr  = GIC_BASER_CACHE_RaWaWb <<
>> GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_SameAsInner <<
>> GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_InnerShareable <<
>> GICR_PENDBASER_SHAREABILITY_SHIFT;
> 
> Same question for the cacheability and shareability.
> 
> Also the variable attr sounds pointless as you will directly assign the
> value to reg with no more computation.
> 
>> +
>> +    lpi_data.lpi_property = alloc_xenheap_pages(MAX_HOST_LPI_BITS -
>> 12, 0);
>> +    if ( !lpi_data.lpi_property )
>> +        return 0;
>> +
>> +    memset(lpi_data.lpi_property, GIC_PRI_IRQ | LPI_PROP_RES1,
>> MAX_HOST_LPIS);
>> +    __flush_dcache_area(lpi_data.lpi_property, MAX_HOST_LPIS);
>> +
>> +    reg  = attr | ((MAX_HOST_LPI_BITS - 1) << 0);
>> +    reg |= virt_to_maddr(lpi_data.lpi_property) & GENMASK(51, 12);
> 
> Same remark for the mask here.
> 
>> +
>> +    return reg;
>> +}
>> +
>> +int gicv3_lpi_init_host_lpis(int lpi_bits)
> 
> Please use unsigned int for lpi_bits.
> 
> Also this function should probably be in the section __init.
> 
>> +{
>> +    lpi_data.host_lpi_bits = lpi_bits;
>> +
>> +    printk("GICv3: using at most %ld LPIs on the host.\n",
>> MAX_HOST_LPIS);
> 
> %lu.
> 
>> +
>> +    return 0;
>> +}
>> +
>>  void gicv3_its_dt_init(const struct dt_device_node *node)
>>  {
>>      const struct dt_device_node *its = NULL;
>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
>> index 238da84..2534aa5 100644
>> --- a/xen/arch/arm/gic-v3.c
>> +++ b/xen/arch/arm/gic-v3.c
>> @@ -546,6 +546,9 @@ static void __init gicv3_dist_init(void)
>>      type = readl_relaxed(GICD + GICD_TYPER);
>>      nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
>>
>> +    if ( type & GICD_TYPE_LPIS )
>> +        gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) &
>> 0x1f) + 1);
>> +
>>      printk("GICv3: %d lines, (IID %8.8x).\n",
>>             nr_lines, readl_relaxed(GICD + GICD_IIDR));
>>
>> @@ -615,6 +618,26 @@ static int gicv3_enable_redist(void)
>>
>>      return 0;
>>  }
> 
> Missing blank line here.
> 
>> +static void gicv3_rdist_init_lpis(void __iomem * rdist_base)
>> +{
>> +    uint32_t reg;
>> +    uint64_t table_reg;
>> +
>> +    if ( list_empty(&host_its_list) )
>> +        return;
>> +
>> +    /* Make sure LPIs are disabled before setting up the BASERs. */
>> +    reg = readl_relaxed(rdist_base + GICR_CTLR);
>> +    writel_relaxed(reg & ~GICR_CTLR_ENABLE_LPIS, rdist_base +
>> GICR_CTLR);
>> +
>> +    table_reg = gicv3_lpi_allocate_pendtable();
>> +    if ( table_reg )
>> +        writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);
> 
> Is it fine to continue silently if gicv3_lpi_allocate_pendtable has failed?

Probably not, I have fixed this already as Stefano pointed out the same.

>> +
>> +    table_reg = gicv3_lpi_get_proptable();
>> +    if ( table_reg )
>> +        writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);
> 
> Ditto.
> 
> 
>> +}
>>
>>  static int __init gicv3_populate_rdist(void)
>>  {
>> @@ -658,6 +681,10 @@ static int __init gicv3_populate_rdist(void)
>>              if ( (typer >> 32) == aff )
>>              {
>>                  this_cpu(rbase) = ptr;
>> +
>> +                if ( typer & GICR_TYPER_PLPIS )
>> +                    gicv3_rdist_init_lpis(ptr);
>> +
>>                  printk("GICv3: CPU%d: Found redistributor in region
>> %d @%p\n",
>>                          smp_processor_id(), i, ptr);
>>                  return 0;
>> diff --git a/xen/include/asm-arm/cache.h b/xen/include/asm-arm/cache.h
>> index 2de6564..af96eee 100644
>> --- a/xen/include/asm-arm/cache.h
>> +++ b/xen/include/asm-arm/cache.h
>> @@ -7,6 +7,10 @@
>>  #define L1_CACHE_SHIFT  (CONFIG_ARM_L1_CACHE_SHIFT)
>>  #define L1_CACHE_BYTES  (1 << L1_CACHE_SHIFT)
>>
>> +#ifndef __ASSEMBLY__
>> +void __flush_dcache_area(const void *vaddr, unsigned long size);
>> +#endif
>> +
> 
> Please move this change in a separate patch.
> 
>>  #define __read_mostly __section(".data.read_mostly")
>>
>>  #endif
>> diff --git a/xen/include/asm-arm/gic-its.h
>> b/xen/include/asm-arm/gic-its.h
>> index 2f5c51c..48c6c78 100644
>> --- a/xen/include/asm-arm/gic-its.h
>> +++ b/xen/include/asm-arm/gic-its.h
>> @@ -36,12 +36,32 @@ extern struct list_head host_its_list;
>>  /* Parse the host DT and pick up all host ITSes. */
>>  void gicv3_its_dt_init(const struct dt_device_node *node);
>>
>> +/* Allocate and initialize tables for each host redistributor.
>> + * Returns the respective {PROP,PEND}BASER register value.
>> + */
>> +uint64_t gicv3_lpi_get_proptable(void);
>> +uint64_t gicv3_lpi_allocate_pendtable(void);
>> +
>> +/* Initialize the host structures for LPIs. */
>> +int gicv3_lpi_init_host_lpis(int nr_lpis);
>> +
>>  #else
>>
>>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
>>  {
>>  }
>> -
> 
> Please add a newline here.
> 
>> +static inline uint64_t gicv3_lpi_get_proptable(void)
>> +{
>> +    return 0;
>> +}
> 
> Ditto
> 
>> +static inline uint64_t gicv3_lpi_allocate_pendtable(void)
>> +{
>> +    return 0;
>> +}
> 
> Ditto
> 
>> +static inline int gicv3_lpi_init_host_lpis(int nr_lpis)
>> +{
>> +    return 0;
>> +}
> 
> Ditto
> 
>>  #endif /* CONFIG_HAS_ITS */
>>
>>  #endif /* __ASSEMBLY__ */
>> diff --git a/xen/include/asm-arm/gic_v3_defs.h
>> b/xen/include/asm-arm/gic_v3_defs.h
>> index 6bd25a5..da5fb77 100644
>> --- a/xen/include/asm-arm/gic_v3_defs.h
>> +++ b/xen/include/asm-arm/gic_v3_defs.h
>> @@ -44,7 +44,8 @@
>>  #define GICC_SRE_EL2_ENEL1           (1UL << 3)
>>
>>  /* Additional bits in GICD_TYPER defined by GICv3 */
>> -#define GICD_TYPE_ID_BITS_SHIFT 19
>> +#define GICD_TYPE_ID_BITS_SHIFT      19
>> +#define GICD_TYPE_LPIS               (1U << 17)
> 
> I was about to say that this should be named GICD_TYPER... but it looks
> like we already defined and use GIC_TYPE_ID_BITS_SHIFTS. So it is up to
> you if you rename it to get the correct register name.

Yeah, I was unsure about this as well. My hunch is we should avoid the
churn and keep existing names around. Experience shows that those simple
renames tend to introduce nasty rebase issues, especially with a
long-standing series like this.

>>
>>  #define GICD_CTLR_RWP                (1UL << 31)
>>  #define GICD_CTLR_ARE_NS             (1U << 4)
>> @@ -95,12 +96,57 @@
>>  #define GICR_IGRPMODR0               (0x0D00)
>>  #define GICR_NSACR                   (0x0E00)
>>
>> +#define GICR_CTLR_ENABLE_LPIS        (1U << 0)
> 
> Please add a new line here to separate definition for GICR_CTLR and
> GICR_TYPER.
> 
>>  #define GICR_TYPER_PLPIS             (1U << 0)
>>  #define GICR_TYPER_VLPIS             (1U << 1)
>>  #define GICR_TYPER_LAST              (1U << 4)
>>
>> +#define GIC_BASER_CACHE_nCnB         0ULL
>> +#define GIC_BASER_CACHE_SameAsInner  0ULL
> 
> I think this would require some description in the code as it is not
> clear wheather nCnB apply for Outer or Inner. From my understanding it
> is only the later.
> 
>> +#define GIC_BASER_CACHE_nC           1ULL
>> +#define GIC_BASER_CACHE_RaWt         2ULL
>> +#define GIC_BASER_CACHE_RaWb         3ULL
>> +#define GIC_BASER_CACHE_WaWt         4ULL
>> +#define GIC_BASER_CACHE_WaWb         5ULL
>> +#define GIC_BASER_CACHE_RaWaWt       6ULL
>> +#define GIC_BASER_CACHE_RaWaWb       7ULL
>> +#define GIC_BASER_CACHE_MASK         7ULL
> 
> New line here please and maybe a comment to say this is shareability
> definition.
> 
>> +#define GIC_BASER_NonShareable       0ULL
>> +#define GIC_BASER_InnerShareable     1ULL
>> +#define GIC_BASER_OuterShareable     2ULL
>> +
>> +#define GICR_PROPBASER_SHAREABILITY_SHIFT               10
>> +#define GICR_PROPBASER_INNER_CACHEABILITY_SHIFT         7
> 
>> +#define GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT         56
>> +#define GICR_PROPBASER_SHAREABILITY_MASK                     \
>> +        (3UL << GICR_PROPBASER_SHAREABILITY_SHIFT)
> 
> It might be better to define GIC_BASER_SHAREABILITY_MASK rather than
> open-coding 3UL. Also technically 3UL should be 3ULL.

I really think its hard to read already, naming this also breaks 80
characters, I believe (I think I tried it).
So while I appreciate the habit of naming magic constants, I wonder if
this is really going bonkers here. After all we define a mask here, so
_somewhere_ these actual numbers would need to end up, wouldn't they?

>> +#define GICR_PROPBASER_INNER_CACHEABILITY_MASK               \
>> +        (7UL << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT)
> 
> Same remark here.
> 
>> +#define GICR_PROPBASER_OUTER_CACHEABILITY_MASK               \
>> +        (7UL << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT)
> 
> Ditto
> 
>> +#define PROPBASER_RES0_MASK                                  \
> 
> I would probably rename this field GICR_PROPBASER_RES0_MASK.
> 
>> +        (GENMASK(63, 59) | GENMASK(55, 52) | GENMASK(6, 5))
>> +
>> +#define GICR_PENDBASER_SHAREABILITY_SHIFT               10
>> +#define GICR_PENDBASER_INNER_CACHEABILITY_SHIFT         7
>> +#define GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT         56
>> +#define GICR_PENDBASER_SHAREABILITY_MASK                     \
>> +    (3UL << GICR_PENDBASER_SHAREABILITY_SHIFT)
> 
> See my remark above.
> 
>> +#define GICR_PENDBASER_INNER_CACHEABILITY_MASK               \
>> +    (7UL << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT)
> 
> Ditto
> 
>> +#define GICR_PENDBASER_OUTER_CACHEABILITY_MASK               \
>> +        (7UL << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT)
>> +#define GICR_PENDBASER_PTZ                              BIT(62)
> 
> Please don't use BIT but either 1ULL << 62 or introduce BIT_ULL.
> 
>> +#define PENDBASER_RES0_MASK                                  \
> 
> GICR_PENDBASER_RES0_MASK
> 
>> +        (BIT(63) | GENMASK(61, 59) | GENMASK(55, 52) |       \
>> +         GENMASK(15, 12) | GENMASK(6, 0))
>> +
>>  #define DEFAULT_PMR_VALUE            0xff
>>
>> +#define LPI_PROP_DEFAULT_PRIO        0xa0
> 
> You define LPI_PROP_DEFAULT_PRIO but never used it within this series.
> In any case, it would be better to keep using GIC_PRI_IRQ (as you did)
> as make LPI_PROP_DEFAULT_PRIO an alias of GIC_PRI_IRQ to avoid spreading
> the priority everywhere (for now they are all defined in gic.h).

Sure.
(And basically "yes, will fix" to anything I haven't replied on
explictly above).

Cheers,
Andre.
> 
>> +#define LPI_PROP_RES1                (1 << 1)
>> +#define LPI_PROP_ENABLED             (1 << 0)
>> +
>>  #define GICH_VMCR_EOI                (1 << 9)
>>  #define GICH_VMCR_VENG1              (1 << 1)
>>
>>
> 
> Regards,
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.