[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v4 07/11] xen: add cache coloring allocator for domains
On 27.01.2023 11:17, Carlo Nonato wrote: > On Thu, Jan 26, 2023 at 5:29 PM Jan Beulich <jbeulich@xxxxxxxx> wrote: >> On 23.01.2023 16:47, Carlo Nonato wrote: >>> --- a/xen/arch/arm/include/asm/mm.h >>> +++ b/xen/arch/arm/include/asm/mm.h >>> @@ -128,6 +128,9 @@ struct page_info >>> #else >>> #define PGC_static 0 >>> #endif >>> +/* Page is cache colored */ >>> +#define _PGC_colored PG_shift(4) >>> +#define PGC_colored PG_mask(1, 4) >> >> Is there a reason you don't follow the conditional approach we've taken >> for PGC_static? > > Nope, fixed that. And btw, if at all possible please avoid the #else part. page_alloc.c already deals with that case, so it would be needed only if you have a use of this constant somewhere else. >>> @@ -1488,7 +1497,7 @@ static void free_heap_pages( >>> /* Merge with predecessor block? */ >>> if ( !mfn_valid(page_to_mfn(predecessor)) || >>> !page_state_is(predecessor, free) || >>> - (predecessor->count_info & PGC_static) || >>> + (predecessor->count_info & (PGC_static | PGC_colored)) || >>> (PFN_ORDER(predecessor) != order) || >>> (page_to_nid(predecessor) != node) ) >>> break; >>> @@ -1512,7 +1521,7 @@ static void free_heap_pages( >>> /* Merge with successor block? */ >>> if ( !mfn_valid(page_to_mfn(successor)) || >>> !page_state_is(successor, free) || >>> - (successor->count_info & PGC_static) || >>> + (successor->count_info & (PGC_static | PGC_colored)) || >>> (PFN_ORDER(successor) != order) || >>> (page_to_nid(successor) != node) ) >>> break; >> >> This, especially without being mentioned in the description (only in the >> revision log), could likely also be split out (and then also be properly >> justified). > > You mean to make it a prereq patch or putting it after this patch? A prereq, for ... > In the second case it would be like a fix for the previous patch, something > that is to be avoided, right? ... precisely this reason. Elsewhere you make a constant from PGC_extra | PGC_static | PGC_colored. Maybe that could become a file scope one, which you could then use here as well. Then the change wouldn't even depend on you already having introduced PGC_colored (but otherwise just having the stub #define earlier in the file would suffice as well for this to compile fine independent of the bulk of the changes). (FTAOD PGC_extra would be meaningless / benign here, for only ever being set on allocated pages.) >>> +static void free_color_heap_page(struct page_info *pg) >>> +{ >>> + unsigned int color = page_to_llc_color(pg), nr_colors = >>> get_nr_llc_colors(); >>> + unsigned long pdx = page_to_pdx(pg); >>> + colored_pages_t *head = color_heap(color); >>> + struct page_info *prev = pdx >= nr_colors ? pg - nr_colors : NULL; >>> + struct page_info *next = pdx + nr_colors < FRAMETABLE_NR ? pg + >>> nr_colors >>> + : NULL; >> >> Are these two calculations safe? At least on x86 parts of frame_table[] may >> not be populated, so de-referencing prev and/or next might fault. > > I have to admit I've not taken this into consideration. I'll check that. > >>> + spin_lock(&heap_lock); >>> + >>> + if ( is_free_colored_page(prev) ) >>> + next = page_list_next(prev, head); >>> + else if ( !is_free_colored_page(next) ) >>> + { >>> + /* >>> + * FIXME: linear search is slow, but also note that the frametable >>> is >>> + * used to find free pages in the immediate neighborhood of pg in >>> + * constant time. When freeing contiguous pages, the insert >>> position of >>> + * most of them is found without the linear search. >>> + */ >>> + page_list_for_each( next, head ) >>> + { >>> + if ( page_to_maddr(next) > page_to_maddr(pg) ) >>> + break; >>> + } >>> + } >>> + >>> + mark_page_free(pg, page_to_mfn(pg)); >>> + pg->count_info |= PGC_colored; >>> + free_colored_pages[color]++; >>> + page_list_add_prev(pg, next, head); >>> + >>> + spin_unlock(&heap_lock); >>> +} >> >> There's no scrubbing here at all, and no mention of the lack thereof in >> the description. > > This comes from the very first version (which I'm not an author of). > Can you explain to me briefly what is it and if I need it? Or can you give > me pointers? Did you look for occurrences of the word "scrub" elsewhere in the file? You want to avoid pages holding information from one guest to become visible unchanged to another one. >>> +static void __init init_color_heap_pages(struct page_info *pg, >>> + unsigned long nr_pages) >>> +{ >>> + unsigned int i; >>> + >>> + if ( buddy_alloc_size ) >>> + { >>> + unsigned long buddy_pages = min(PFN_DOWN(buddy_alloc_size), >>> nr_pages); >>> + >>> + init_heap_pages(pg, buddy_pages); >>> + nr_pages -= buddy_pages; >>> + buddy_alloc_size -= buddy_pages << PAGE_SHIFT; >>> + pg += buddy_pages; >>> + } >> >> I think you want to bail here if nr_pages is now zero, not the least to avoid >> crashing ... >> >>> + if ( !_color_heap ) >>> + { >>> + unsigned int nr_colors = get_nr_llc_colors(); >>> + >>> + _color_heap = xmalloc_array(colored_pages_t, nr_colors); >>> + BUG_ON(!_color_heap); >>> + free_colored_pages = xzalloc_array(unsigned long, nr_colors); >>> + BUG_ON(!free_colored_pages); >> >> ... here in case the amount that was freed was really tiny. > > Here the array is allocated with nr_colors not nr_pages. nr_colors can't be 0. > And if nr_pages is 0 it just means that all pages went to the buddy. I see no > crash in this case. Aren't the two BUG_ON() not very clear crash causes? My comment wasn't about nr_pages == 0 being an issue further down (I think I had convinced myself that this case was handled correctly), but about the buddy allocator still having too little memory to satisfy the two allocations here (which also you don't need yet if there's no memory to go to the colored allocator in the first place). >>> --- a/xen/include/xen/mm.h >>> +++ b/xen/include/xen/mm.h >>> @@ -299,6 +299,33 @@ page_list_add_tail(struct page_info *page, struct >>> page_list_head *head) >>> } >>> head->tail = page; >>> } >>> +static inline void >>> +_page_list_add(struct page_info *page, struct page_info *prev, >>> + struct page_info *next) >>> +{ >>> + page->list.prev = page_to_pdx(prev); >>> + page->list.next = page_to_pdx(next); >>> + prev->list.next = page_to_pdx(page); >>> + next->list.prev = page_to_pdx(page); >>> +} >>> +static inline void >>> +page_list_add_prev(struct page_info *page, struct page_info *next, >>> + struct page_list_head *head) >>> +{ >>> + struct page_info *prev; >>> + >>> + if ( !next ) >>> + { >>> + page_list_add_tail(page, head); >>> + return; >>> + } >> >> !next is ambiguous in its meaning, so a comment towards the intended >> behavior here would be helpful. It could be that the tail insertion is >> necessary behavior, but it also could be that insertion anywhere would >> actually be okay, and tail insertion is merely the variant you ended up >> picking. > > This makes it possible to call the function when next has been used to iterate > over a list. If there's no next we are at the end of the list, so add to the > tail. The other way is to handle the case outside this function. > >> Then again ... >> >>> + prev = page_list_prev(next, head); >>> + if ( !prev ) >>> + page_list_add(page, head); >>> + else >>> + _page_list_add(page, prev, next); >>> +} >>> static inline bool_t >>> __page_list_del_head(struct page_info *page, struct page_list_head *head, >>> struct page_info *next, struct page_info *prev) >>> @@ -451,6 +478,12 @@ page_list_add_tail(struct page_info *page, struct >>> page_list_head *head) >>> list_add_tail(&page->list, head); >>> } >>> static inline void >>> +page_list_add_prev(struct page_info *page, struct page_info *next, >>> + struct page_list_head *head) >>> +{ >>> + list_add_tail(&page->list, &next->list); >> >> ... you don't care about !next here at all? > > I did it this way since *page is never checked for NULL as well. Also, this > other type of list isn't NULL terminated. I see. In which case properly explaining the intended behavior and use case in the earlier function should hopefully eliminate the need for anything special here. I have to admit though that I consider this a little fragile for a seemingly generic helper function; I wonder whether the special case wouldn't better be handled in the caller instead. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |