Re: [PATCH v6 10/15] xen: add cache coloring allocator for domains



Hi Jan,

On Thu, Feb 1, 2024 at 4:53 PM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>
> On 29.01.2024 18:18, Carlo Nonato wrote:
> > @@ -157,7 +158,11 @@
> >  #define PGC_static 0
> >  #endif
> >
> > -#define PGC_preserved (PGC_extra | PGC_static)
> > +#ifndef PGC_colored
> > +#define PGC_colored 0
> > +#endif
> > +
> > +#define PGC_preserved (PGC_extra | PGC_static | PGC_colored)
> >
> >  #ifndef PGT_TYPE_INFO_INITIALIZER
> >  #define PGT_TYPE_INFO_INITIALIZER 0
> > @@ -1504,6 +1509,7 @@ static void free_heap_pages(
> >              if ( !mfn_valid(page_to_mfn(predecessor)) ||
> >                   !page_state_is(predecessor, free) ||
> >                   (predecessor->count_info & PGC_static) ||
> > +                 (predecessor->count_info & PGC_colored) ||
>
> How about a 2nd #define (e.g. PGC_no_buddy_merge) to use here and ...
>
> >                   (PFN_ORDER(predecessor) != order) ||
> >                   (page_to_nid(predecessor) != node) )
> >                  break;
> > @@ -1528,6 +1534,7 @@ static void free_heap_pages(
> >              if ( !mfn_valid(page_to_mfn(successor)) ||
> >                   !page_state_is(successor, free) ||
> >                   (successor->count_info & PGC_static) ||
> > +                 (successor->count_info & PGC_colored) ||
>
> ... here? That'll then also avoid me commenting that I don't see why
> these two PGC_* checks aren't folded.

Yes, that's fine with me (I even suggested it in v5). Do you want this change
merged into the previous patch, or should it all belong to this one?
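Something like this, I assume (taking the name from your suggestion; placing
it next to PGC_preserved is my guess):

    #define PGC_no_buddy_merge (PGC_static | PGC_colored)

so that both checks collapse into one, e.g. for the predecessor:

    if ( !mfn_valid(page_to_mfn(predecessor)) ||
         !page_state_is(predecessor, free) ||
         (predecessor->count_info & PGC_no_buddy_merge) ||
         (PFN_ORDER(predecessor) != order) ||
         (page_to_nid(predecessor) != node) )
        break;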

> > @@ -1943,6 +1950,161 @@ static unsigned long avail_heap_pages(
> >      return free_pages;
> >  }
> >
> > +/*************************
> > + * COLORED SIDE-ALLOCATOR
> > + *
> > + * Pages are grouped by LLC color in lists which are globally referred to as the
> > + * color heap. Lists are populated in end_boot_allocator().
> > + * After initialization there will be N lists where N is the number of
> > + * available colors on the platform.
> > + */
> > +static struct page_list_head *__ro_after_init _color_heap;
> > +static unsigned long *__ro_after_init free_colored_pages;
> > +
> > +/* Memory required for buddy allocator to work with colored one */
> > +#ifdef CONFIG_LLC_COLORING
> > +static unsigned long __initdata buddy_alloc_size =
> > +    MB(CONFIG_BUDDY_ALLOCATOR_SIZE);
> > +size_param("buddy-alloc-size", buddy_alloc_size);
> > +
> > +#define domain_num_llc_colors(d) (d)->num_llc_colors
> > +#define domain_llc_color(d, i)   (d)->llc_colors[(i)]
>
> Nit: Unnecessary parentheses inside the square brackets.
>
> > +#else
> > +static unsigned long __initdata buddy_alloc_size;
> > +
> > +#define domain_num_llc_colors(d) 0
> > +#define domain_llc_color(d, i)   0
> > +#endif
> > +
> > +#define color_heap(color) (&_color_heap[color])
>
> Wouldn't this better live next to _color_heap()'s definition?
>
> > +static void free_color_heap_page(struct page_info *pg, bool need_scrub)
> > +{
> > +    unsigned int color = page_to_llc_color(pg);
> > +    struct page_list_head *head = color_heap(color);
> > +
> > +    spin_lock(&heap_lock);
> > +
> > +    mark_page_free(pg, page_to_mfn(pg));
> > +
> > +    if ( need_scrub )
> > +    {
> > +        pg->count_info |= PGC_need_scrub;
> > +        poison_one_page(pg);
> > +    }
> > +
> > +    pg->count_info |= PGC_colored;
>
> The page transiently losing PGC_colored is not an issue then (presumably
> because of holding the heap lock)? Did you consider having mark_page_free()
> also use PGC_preserved?

I did something similar to what is done in unprepare_staticmem_pages():
mark_page_free() is called first, and then PGC_static is added to count_info.
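For reference, this is the sequence there, as I read the current code
(abridged, inside the loop over the static page range):

    mark_page_free(&pg[i], mfn_add(mfn, i));

    /* (scrubbing elided) */

    /* In case initializing page of static memory, mark it PGC_static. */
    pg[i].count_info |= PGC_static;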

> > +    free_colored_pages[color]++;
> > +    page_list_add(pg, head);
> > +
> > +    spin_unlock(&heap_lock);
> > +}
> > +
> > +static struct page_info *alloc_color_heap_page(unsigned int memflags,
> > +                                               const struct domain *d)
> > +{
> > +    struct page_info *pg = NULL;
> > +    unsigned int i, color = 0;
> > +    unsigned long max = 0;
> > +    bool need_tlbflush = false;
> > +    uint32_t tlbflush_timestamp = 0;
> > +    bool scrub = !(memflags & MEMF_no_scrub);
> > +
> > +    spin_lock(&heap_lock);
> > +
> > +    for ( i = 0; i < domain_num_llc_colors(d); i++ )
> > +    {
> > +        unsigned long free = free_colored_pages[domain_llc_color(d, i)];
> > +
> > +        if ( free > max )
> > +        {
> > +            color = domain_llc_color(d, i);
> > +            pg = page_list_first(color_heap(color));
> > +            max = free;
> > +        }
> > +    }
>
> The approach is likely fine at least initially, but: by going from where
> the most free pages are, you're not necessarily evenly distributing a
> domain's pages over the colors it may use, unless the domain uses its
> set of colors exclusively.

We don't consider it a problem, since an exclusive set of colors is the
recommended configuration (as per the documentation).

> > +    if ( !pg )
> > +    {
> > +        spin_unlock(&heap_lock);
> > +        return NULL;
> > +    }
> > +
> > +    pg->count_info = PGC_state_inuse | PGC_colored |
> > +                     (pg->count_info & PGC_need_scrub);
>
> Isn't PGC_colored already set on the page?

Yes, but here I need to make sure that only PGC_state_inuse and PGC_colored
(plus a possibly pending PGC_need_scrub) end up in count_info, so an
assignment seems legitimate.
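Spelled out, the assignment is meant to be equivalent to this sketch:

    /* Keep only a possibly pending scrub request... */
    pg->count_info &= PGC_need_scrub;
    /* ...then mark the page as allocated and (re)assert PGC_colored. */
    pg->count_info |= PGC_state_inuse | PGC_colored;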

> Together with ...
>
> > +    free_colored_pages[color]--;
> > +    page_list_del(pg, color_heap(color));
> > +
> > +    if ( !(memflags & MEMF_no_tlbflush) )
> > +        accumulate_tlbflush(&need_tlbflush, pg, &tlbflush_timestamp);
> > +
> > +    init_free_page_fields(pg);
> > +
> > +    spin_unlock(&heap_lock);
> > +
> > +    if ( test_and_clear_bit(_PGC_need_scrub, &pg->count_info) && scrub )
>
> ... this, can't the above be simplified?

I tried to replicate what happens in alloc_heap_pages() where:

    /* Preserve PGC_need_scrub so we can check it after lock is dropped. */
    pg[i].count_info = PGC_state_inuse | (pg[i].count_info & PGC_need_scrub);

and then after the unlock the bit is tested.

> > +        scrub_one_page(pg);
> > +    else if ( scrub )
> > +        check_one_page(pg);
> > +
> > +    if ( need_tlbflush )
> > +        filtered_flush_tlb_mask(tlbflush_timestamp);
> > +
> > +    flush_page_to_ram(mfn_x(page_to_mfn(pg)),
> > +                      !(memflags & MEMF_no_icache_flush));
> > +
> > +    return pg;
> > +}
> > +
> > +static void __init init_color_heap_pages(struct page_info *pg,
> > +                                         unsigned long nr_pages)
> > +{
> > +    unsigned int i;
> > +    bool need_scrub = opt_bootscrub == BOOTSCRUB_IDLE;
> > +
> > +    if ( buddy_alloc_size )
> > +    {
> > +        unsigned long buddy_pages = min(PFN_DOWN(buddy_alloc_size), nr_pages);
> > +
> > +        init_heap_pages(pg, buddy_pages);
> > +        nr_pages -= buddy_pages;
> > +        buddy_alloc_size -= buddy_pages << PAGE_SHIFT;
> > +        pg += buddy_pages;
> > +    }
> > +
> > +    if ( !_color_heap )
> > +    {
> > +        unsigned int max_nr_colors = get_max_nr_llc_colors();
> > +
> > +        _color_heap = xmalloc_array(struct page_list_head, max_nr_colors);
> > +        free_colored_pages = xzalloc_array(unsigned long, max_nr_colors);
> > +        if ( !_color_heap || !free_colored_pages )
> > +            panic("Can't allocate colored heap. Buddy reserved size is too 
> > low");
> > +
> > +        for ( i = 0; i < max_nr_colors; i++ )
> > +            INIT_PAGE_LIST_HEAD(color_heap(i));
> > +    }
> > +
> > +    if ( nr_pages )
> > +        printk(XENLOG_DEBUG
> > +               "Init color heap with %lu pages, start MFN: %"PRI_mfn"\n",
> > +               nr_pages, mfn_x(page_to_mfn(pg)));
>
> This message can be issued more than once. Is that really desirable /
> necessary?

No I can drop it.

> > @@ -2458,7 +2626,14 @@ struct page_info *alloc_domheap_pages(
> >      if ( memflags & MEMF_no_owner )
> >          memflags |= MEMF_no_refcount;
> >
> > -    if ( !dma_bitsize )
> > +    /* Only domains are supported for coloring */
> > +    if ( d && llc_coloring_enabled )
> > +    {
> > +        /* Colored allocation must be done on 0 order */
> > +        if ( order || (pg = alloc_color_heap_page(memflags, d)) == NULL )
> > +            return NULL;
> > +    }
>
> I think I had asked before: What about MEMF_node() or MEMF_bits()
> having been used by the caller?

MEMF_node() is used for NUMA, right? I think that, for the moment, it's safe
to ignore it, since cache coloring is only supported on arm64 and NUMA isn't
supported there yet.
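That said, if a guard is wanted anyway, the colored path could explicitly
refuse exact-node requests, along these lines (hypothetical sketch, since the
color heap isn't node-aware):

    /* Colored allocations ignore NUMA placement: refuse requests that
     * demand an exact node rather than silently misplacing them. */
    if ( (memflags & MEMF_exact_node) &&
         MEMF_get_node(memflags) != NUMA_NO_NODE )
        return NULL;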

I'm not sure I understand what MEMF_bits() is for. It restricts the
allocator to returning pages below a certain address width, right?
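If so, for what it's worth, my understanding of a typical use would be:

    /* Example (my understanding): request memory whose machine addresses
     * fit in 32 bits, e.g. for a DMA-limited device. The colored
     * allocator currently wouldn't honour this restriction. */
    pg = alloc_domheap_pages(d, 0, MEMF_bits(32));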

Thanks.

> Jan
