[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 10/15] xen: add cache coloring allocator for domains



Hi Jan,

On Mon, Feb 5, 2024 at 1:38 PM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>
> On 05.02.2024 12:59, Carlo Nonato wrote:
> > On Thu, Feb 1, 2024 at 4:53 PM Jan Beulich <jbeulich@xxxxxxxx> wrote:
> >> On 29.01.2024 18:18, Carlo Nonato wrote:
> >>> @@ -157,7 +158,11 @@
> >>>  #define PGC_static 0
> >>>  #endif
> >>>
> >>> -#define PGC_preserved (PGC_extra | PGC_static)
> >>> +#ifndef PGC_colored
> >>> +#define PGC_colored 0
> >>> +#endif
> >>> +
> >>> +#define PGC_preserved (PGC_extra | PGC_static | PGC_colored)
> >>>
> >>>  #ifndef PGT_TYPE_INFO_INITIALIZER
> >>>  #define PGT_TYPE_INFO_INITIALIZER 0
> >>> @@ -1504,6 +1509,7 @@ static void free_heap_pages(
> >>>              if ( !mfn_valid(page_to_mfn(predecessor)) ||
> >>>                   !page_state_is(predecessor, free) ||
> >>>                   (predecessor->count_info & PGC_static) ||
> >>> +                 (predecessor->count_info & PGC_colored) ||
> >>
> >> How about a 2nd #define (e.g. PGC_no_buddy_merge) to use here and ...
> >>
> >>>                   (PFN_ORDER(predecessor) != order) ||
> >>>                   (page_to_nid(predecessor) != node) )
> >>>                  break;
> >>> @@ -1528,6 +1534,7 @@ static void free_heap_pages(
> >>>              if ( !mfn_valid(page_to_mfn(successor)) ||
> >>>                   !page_state_is(successor, free) ||
> >>>                   (successor->count_info & PGC_static) ||
> >>> +                 (successor->count_info & PGC_colored) ||
> >>
> >> ... here? That'll then also avoid me commenting that I don't see why
> >> these two PGC_* checks aren't folded.
> >
> > Yes for me it's ok (I even suggested that in v5). Do you want this change to
> > be merged with the previous patch? Or should they all belong to this one?
>
> Or make yet another small prereq patch?

Ok.

> >>> +static void free_color_heap_page(struct page_info *pg, bool need_scrub)
> >>> +{
> >>> +    unsigned int color = page_to_llc_color(pg);
> >>> +    struct page_list_head *head = color_heap(color);
> >>> +
> >>> +    spin_lock(&heap_lock);
> >>> +
> >>> +    mark_page_free(pg, page_to_mfn(pg));
> >>> +
> >>> +    if ( need_scrub )
> >>> +    {
> >>> +        pg->count_info |= PGC_need_scrub;
> >>> +        poison_one_page(pg);
> >>> +    }
> >>> +
> >>> +    pg->count_info |= PGC_colored;
> >>
> >> The page transiently losing PGC_colored is not an issue then (presumably
> >> because of holding the heap lock)? Did you consider having mark_page_free()
> >> also use PGC_preserved?
> >
> > I did something similar to what it's done in unprepare_staticmem_pages():
> > first mark_page_free() is called and then PGC_static is added to count_info.
>
> I had guessed this would have been served as reference, yet by saying what
> you did you don't really answer my question. (Really I'm not entirely sure
> the similar staticmem behavior is entirely correct.)

I think holding the heap lock allows writing pg->count_info. Anyway, yes, I
can use PGC_preserved in mark_page_free(), but then pages must be correctly
initialized with PGC_colored elsewhere.

> >>> +    if ( !pg )
> >>> +    {
> >>> +        spin_unlock(&heap_lock);
> >>> +        return NULL;
> >>> +    }
> >>> +
> >>> +    pg->count_info = PGC_state_inuse | PGC_colored |
> >>> +                     (pg->count_info & PGC_need_scrub);
> >>
> >> Isn't PGC_colored already set on the page?
> >
> > Yes but here I need to make sure that only PGC_state_inuse | PGC_colored are
> > used so an assignment seems legit.
>
> Well, yes, you won't be able to avoid an assignment, but couldn't you
> preserve PGC_colored just like PGC_need_scrub is preserved? Or else
> at least assert the flag is set, to avoid giving the impression that
> right here pages can suddenly become "colored" ones? Then again them
> becoming so _may_ be needed during initialization.

Yeah, now I understood.

>
> >> Together with ...
> >>
> >>> +    free_colored_pages[color]--;
> >>> +    page_list_del(pg, color_heap(color));
> >>> +
> >>> +    if ( !(memflags & MEMF_no_tlbflush) )
> >>> +        accumulate_tlbflush(&need_tlbflush, pg, &tlbflush_timestamp);
> >>> +
> >>> +    init_free_page_fields(pg);
> >>> +
> >>> +    spin_unlock(&heap_lock);
> >>> +
> >>> +    if ( test_and_clear_bit(_PGC_need_scrub, &pg->count_info) && scrub )
> >>
> >> ... this, can't the above be simplified?
> >
> > I tried to replicate what happens in alloc_heap_pages() where:
> >
> >>  /* Preserve PGC_need_scrub so we can check it after lock is dropped. */
> >>  pg[i].count_info = PGC_state_inuse | (pg[i].count_info & PGC_need_scrub);
> >
> > and then after the unlock the bit is tested.
>
> Again I was indeed assuming that you used existing code as reference.
> Yet again my question was whether that's actually what is needed here
> (which then may or may not mean that it could be done differently
> also there).

Ok. Probably isn't needed since we are referring to a single page. So I can
first test for the bit, save the result in a variable and remove the
preservation, then later use the variable instead of test_and_clear_bit().

> >>> @@ -2458,7 +2626,14 @@ struct page_info *alloc_domheap_pages(
> >>>      if ( memflags & MEMF_no_owner )
> >>>          memflags |= MEMF_no_refcount;
> >>>
> >>> -    if ( !dma_bitsize )
> >>> +    /* Only domains are supported for coloring */
> >>> +    if ( d && llc_coloring_enabled )
> >>> +    {
> >>> +        /* Colored allocation must be done on 0 order */
> >>> +        if ( order || (pg = alloc_color_heap_page(memflags, d)) == NULL )
> >>> +            return NULL;
> >>> +    }
> >>
> >> I think I had asked before: What about MEMF_node() or MEMF_bits()
> >> having been used by the caller?
> >
> > MEMF_node() is used for NUMA, right? I think that for the moment, since 
> > cache
> > coloring is supported only on arm64 and NUMA is not yet supported, it's safe
> > to ignore it.
>
> NUMA or not, I'm wary of silent ignoring of inputs. I'd rather request
> that, just like with staticmem, precautions are then taken to avoid
> coloring and NUMA to be used together. Recall that much like your work,
> work to support NUMA is also in progress (unless it was canceled, but
> not so announced).

Ok.

> > I'm not sure I understood what MEMF_bits() is for. It restricts the 
> > allocator
> > to return pages in some range, right?
>
> Correct.

So I need to think about this, since I see no easy way of supporting such a
thing with the current colored allocator implementation.
Do you have any suggestions?

Thanks.

> Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.