
Re: [PATCH v6 01/15] xen/common: add cache coloring common code



Hi Jan,

On Thu, Feb 1, 2024 at 1:59 PM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>
> On 29.01.2024 18:17, Carlo Nonato wrote:
> > --- a/xen/arch/Kconfig
> > +++ b/xen/arch/Kconfig
> > @@ -31,3 +31,20 @@ config NR_NUMA_NODES
> >         associated with multiple-nodes management. It is the upper bound of
> >         the number of NUMA nodes that the scheduler, memory allocation and
> >         other NUMA-aware components can handle.
> > +
> > +config LLC_COLORING
> > +     bool "Last Level Cache (LLC) coloring" if EXPERT
> > +     depends on HAS_LLC_COLORING
> > +
> > +config NR_LLC_COLORS
> > +     int "Maximum number of LLC colors"
> > +     range 2 1024
>
> What's the reasoning behind this upper bound? IOW - can something to this
> effect be said in the description, please?

The only reason is that 1024 is the number of colors that fit in a 4 KiB page.
I don't have a better way of picking a number here. 1024 is already big, and
probably nobody would use such a configuration, but 512 or 256 would be
equally arbitrary.
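
For illustration, the arithmetic behind that limit can be sketched as below.
It assumes each color is stored as a 4-byte unsigned int and that the color
array should fit in a single 4 KiB page; neither assumption is stated in the
patch, and SKETCH_PAGE_SIZE and max_colors_per_page() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for illustration: 4 KiB. */
#define SKETCH_PAGE_SIZE 4096u

/* Number of color entries of the given size that fit in one page. */
static unsigned int max_colors_per_page(size_t entry_size)
{
    assert(entry_size != 0);
    return (unsigned int)(SKETCH_PAGE_SIZE / entry_size);
}
```

With 4-byte entries this gives 1024, the upper bound used in the Kconfig range.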

> > +     default 128
> > +     depends on LLC_COLORING
> > +     help
> > +       Controls the build-time size of various arrays associated with LLC
> > +       coloring. Refer to cache coloring documentation for how to compute the
> > +       number of colors supported by the platform. This is only an upper
> > +       bound. The runtime value is autocomputed or manually set via cmdline.
> > +       The default value corresponds to an 8 MiB 16-ways LLC, which should be
> > +       more than what needed in the general case.
>
> Aiui while not outright wrong, non-power-of-2 values are meaningless to
> specify. Perhaps that is worth mentioning (if not making this a value
> that's used as exponent of 2 in the first place)?

Yes, I'd prefer to improve the help message.

> As to the default and its description: As said for the documentation,
> doesn't what this corresponds to also depend on cache line size? Even
> if this was still Arm-specific rather than common code, I'd question
> whether now and forever Arm chips may only use one pre-determined cache
> line size.

I hope I answered in the previous mail why the line size (in the specific case
we are applying coloring to) can be ignored as a parameter in favor of cache
size and number of ways.
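
As a rough sketch of why the line size drops out: the way size is the cache
size divided by the number of ways, which equals line size times number of
sets, and the number of colors is the way size divided by the page size. The
function names below are hypothetical and 4 KiB pages are assumed.

```c
#include <assert.h>

/* Assumed 4 KiB pages for illustration. */
#define SKETCH_PAGE_SHIFT 12

/* Colors from total cache size and associativity; line size is not needed. */
static unsigned int colors_from_size(unsigned long cache_size,
                                     unsigned int ways)
{
    assert(ways != 0);
    return (unsigned int)((cache_size / ways) >> SKETCH_PAGE_SHIFT);
}

/* Same quantity from line size and set count: way size = line_size * sets. */
static unsigned int colors_from_geometry(unsigned int line_size,
                                         unsigned long sets)
{
    return (unsigned int)((line_size * sets) >> SKETCH_PAGE_SHIFT);
}
```

For an 8 MiB 16-way LLC the way size is 512 KiB, so both a 64-byte-line and a
128-byte-line geometry yield 128 colors, matching the Kconfig default.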

> > --- /dev/null
> > +++ b/xen/common/llc-coloring.c
> > @@ -0,0 +1,87 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Last Level Cache (LLC) coloring common code
> > + *
> > + * Copyright (C) 2022 Xilinx Inc.
> > + */
> > +#include <xen/keyhandler.h>
> > +#include <xen/llc-coloring.h>
> > +#include <xen/param.h>
> > +
> > +bool __ro_after_init llc_coloring_enabled;
> > +boolean_param("llc-coloring", llc_coloring_enabled);
>
> The variable has no use right now afaics, so it's unclear whether (a) it
> is legitimately non-static and (b) placed in an appropriate section.

My bad here. The variable should be tested in llc_coloring_init() and in
domain_dump_llc_colors() (and in domain_llc_coloring_free() as well, in later
patches). That change was lost while rebasing the series.

Anyway, as of this patch the global is only accessed from this file, while
later patches access it from outside. What should I do in this case? Declare
it static now and make it non-static afterwards?

> > +/* Size of an LLC way */
> > +static unsigned int __ro_after_init llc_way_size;
> > +size_param("llc-way-size", llc_way_size);
> > +/* Number of colors available in the LLC */
> > +static unsigned int __ro_after_init max_nr_colors = CONFIG_NR_LLC_COLORS;
> > +
> > +static void print_colors(const unsigned int *colors, unsigned int num_colors)
> > +{
> > +    unsigned int i;
> > +
> > +    printk("{ ");
> > +    for ( i = 0; i < num_colors; i++ ) {
>
> Nit (style): Brace placement.
>
> > +        unsigned int start = colors[i], end = colors[i];
> > +
> > +        printk("%u", start);
> > +
> > +        for ( ;
> > +              i < num_colors - 1 && colors[i] + 1 == colors[i + 1];
>
> To reduce the number of array accesses, may I suggest to use "end + 1"
> here instead of "colors[i] + 1"? (The initializer of "end" could also
> be "start", but I guess the compiler will recognize this anyway.) This
> would then (imo) also better justify the desire for having "end" in
> the first place.
>
> > +              i++, end++ );
>
> Imo for clarity the semicolon want to live on its own line.
>
> > +static void dump_coloring_info(unsigned char key)
>
> This being common code now, I think it would be good practice to have
> cf_check here right away, even if for now (for whatever reason) the
> feature is meant to be limited to Arm. (Albeit see below for whether
> this is to remain that way.)
>
> > +void __init llc_coloring_init(void)
> > +{
> > +    if ( !llc_way_size && !(llc_way_size = get_llc_way_size()) )
> > +        panic("Probed LLC coloring way size is 0 and no custom value found\n");
> > +
> > +    /*
> > +     * The maximum number of colors must be a power of 2 in order to correctly
> > +     * map them to bits of an address, so also the LLC way size must be so.
> > +     */
> > +    if ( llc_way_size & (llc_way_size - 1) )
> > +        panic("LLC coloring way size (%u) isn't a power of 2\n", llc_way_size);
> > +
> > +    max_nr_colors = llc_way_size >> PAGE_SHIFT;
>
> With this unconditionally initialized here, what's the purpose of the
> variable's initializer?

Previously I was using the global in parse_color_config() (introduced later),
but since I no longer do that, I can drop the initializer.

> > +    if ( max_nr_colors < 2 || max_nr_colors > CONFIG_NR_LLC_COLORS )
> > +        panic("Number of LLC colors (%u) not in range [2, %u]\n",
> > +              max_nr_colors, CONFIG_NR_LLC_COLORS);
>
> I'm not convinced of panic()ing here (including the earlier two
> instances). You could warn, taint, disable, and continue. If you want
> to stick to panic(), please justify doing so in the description.
>
> Plus, if you panic(), shouldn't that be limited to llc_coloring_enabled
> being true? Or - not visible here, due to the lack of a caller of the
> function - is that meant to be taken care of by the caller (to not call
> here when the flag is off)? I think it would be cleaner if the check
> lived here; quite possibly that would then further permit the flag
> variable to become static.

You're right. As I said, the check on llc_coloring_enabled is missing here.
Running the initialization unconditionally is obviously an error.
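
One possible shape for this, combining the enabled check with the
warn-and-disable alternative to panic() suggested above, is sketched below.
This is standalone illustrative C, not the Xen code: all names are
hypothetical, fprintf() stands in for printk(), and the constants are assumed
values.

```c
#include <stdbool.h>
#include <stdio.h>

#define SKETCH_PAGE_SHIFT    12   /* assumed 4 KiB pages */
#define SKETCH_NR_LLC_COLORS 128  /* assumed build-time upper bound */

static bool coloring_enabled;
static unsigned int nr_colors;

/* Validate the way size; on any problem warn and leave coloring disabled. */
static bool coloring_init_sketch(bool enabled, unsigned int way_size)
{
    unsigned int n;

    coloring_enabled = false;
    nr_colors = 0;

    /* Nothing to validate when the feature was not requested. */
    if ( !enabled )
        return false;

    /* The way size must be a non-zero power of 2. */
    if ( way_size == 0 || (way_size & (way_size - 1)) )
    {
        fprintf(stderr, "way size (%u) invalid, coloring disabled\n",
                way_size);
        return false;
    }

    n = way_size >> SKETCH_PAGE_SHIFT;
    if ( n < 2 || n > SKETCH_NR_LLC_COLORS )
    {
        fprintf(stderr, "colors (%u) not in range [2, %u], disabled\n",
                n, (unsigned int)SKETCH_NR_LLC_COLORS);
        return false;
    }

    nr_colors = n;
    coloring_enabled = true;
    return true;
}
```

Whether disabling is acceptable, or a panic() is still warranted, depends on
how the rest of the series relies on coloring being active.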

> > +    register_keyhandler('K', dump_coloring_info, "dump LLC coloring info", 1);
>
> I'm also not convinced of using a separate key for this little bit of
> information. How about attaching this to what 'm' or 'H' produce?

Ok. 'm' seems the right place.

> > +    arch_llc_coloring_init();
> > +}
> > +
> > +void domain_dump_llc_colors(const struct domain *d)
> > +{
> > +    printk("Domain %pd has %u LLC colors: ", d, d->num_llc_colors);
>
> %pd resolves to d<N> - why "Domain" as a prefix? And really - why the
> domain identifier in the first place? All surrounding information is
> already for this very domain.
>
> > +    print_colors(d->llc_colors, d->num_llc_colors);
>
> Imo this (or perhaps even the entire function) wants skipping when
> num_llc_colors is zero, which would in particular also cover the
> !llc_coloring_enabled case.

This shouldn't be possible. As I said, this function should be a no-op when
!llc_coloring_enabled.
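
A standalone sketch of the printing logic with the review suggestions applied
("end + 1" in the loop condition, the semicolon on its own line, nothing
emitted for an empty list) could look like this. It formats into a buffer with
snprintf() instead of calling printk() so that it can be run outside Xen; the
function name and the exact output format are assumptions, since the tail of
the original print_colors() is not quoted here.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a sorted color list, e.g. "{ 0-3, 7, 9-10 }", into buf.
 * Returns the length of the resulting string (0 if there are no colors).
 */
static size_t format_colors(char *buf, size_t len,
                            const unsigned int *colors,
                            unsigned int num_colors)
{
    size_t off = 0;
    unsigned int i;

    if ( len == 0 )
        return 0;
    buf[0] = '\0';

    /* No-op for an empty list, which also covers the disabled case. */
    if ( num_colors == 0 )
        return 0;

    off += (size_t)snprintf(buf + off, len - off, "{ ");
    for ( i = 0; i < num_colors && off < len; i++ )
    {
        unsigned int start = colors[i], end = start;

        /* Extend the run while the next color is consecutive. */
        for ( ; i < num_colors - 1 && end + 1 == colors[i + 1]; i++, end++ )
            ;

        off += (size_t)snprintf(buf + off, len - off, "%u", start);
        if ( start != end && off < len )
            off += (size_t)snprintf(buf + off, len - off, "-%u", end);
        if ( i < num_colors - 1 && off < len )
            off += (size_t)snprintf(buf + off, len - off, ", ");
    }
    if ( off < len )
        off += (size_t)snprintf(buf + off, len - off, " }");

    /* On truncation snprintf() reports the untruncated length; clamp it. */
    return off < len ? off : len - 1;
}
```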

Thanks.

> > --- a/xen/include/xen/sched.h
> > +++ b/xen/include/xen/sched.h
> > @@ -626,6 +626,11 @@ struct domain
> >
> >      /* Holding CDF_* constant. Internal flags for domain creation. */
> >      unsigned int cdf;
> > +
> > +#ifdef CONFIG_LLC_COLORING
> > +    unsigned const int *llc_colors;
>
> const unsigned int * please.
>
> Jan



 

