Re: [PATCH v2 07/12] mm: allow page scrubbing routine(s) to be arch controlled
On 27.05.2021 15:06, Julien Grall wrote:
> On 27/05/2021 13:33, Jan Beulich wrote:
>> Especially when dealing with large amounts of memory, memset() may not
>> be very efficient; this can be bad enough that even for debug builds a
>> custom function is warranted. We additionally want to distinguish "hot"
>> and "cold" cases.
>
> Do you have any benchmark showing the performance improvement?

This is based on the numbers provided at
https://lists.xen.org/archives/html/xen-devel/2021-04/msg00716.html
(???) with the thread with some of the prior discussion rooted at
https://lists.xen.org/archives/html/xen-devel/2021-04/msg00425.html

I'm afraid I lack ideas on how to sensibly measure _all_ of the
effects (i.e. including the amount of disturbing of caches).

>> ---
>> The choice between hot and cold in scrub_one_page()'s callers is
>> certainly up for discussion / improvement.
>
> To get the discussion started, can you explain how you made the decision
> between hot/cold? This will also want to be written down in the commit
> message.

Well, the initial trivial heuristic is "allocation for oneself" vs
"allocation for someone else, or freeing, or scrubbing", i.e. whether
it would be likely that the page will soon be accessed again (or for
the first time).

>> --- /dev/null
>> +++ b/xen/arch/x86/scrub_page.S
>> @@ -0,0 +1,41 @@
>> +        .file __FILE__
>> +
>> +#include <asm/asm_defns.h>
>> +#include <xen/page-size.h>
>> +#include <xen/scrub.h>
>> +
>> +ENTRY(scrub_page_cold)
>> +        mov     $PAGE_SIZE/32, %ecx
>> +        mov     $SCRUB_PATTERN, %rax
>> +
>> +0:      movnti  %rax,   (%rdi)
>> +        movnti  %rax,  8(%rdi)
>> +        movnti  %rax, 16(%rdi)
>> +        movnti  %rax, 24(%rdi)
>> +        add     $32, %rdi
>> +        sub     $1, %ecx
>> +        jnz     0b
>> +
>> +        sfence
>> +        ret
>> +        .type scrub_page_cold, @function
>> +        .size scrub_page_cold, . - scrub_page_cold
>> +
>> +        .macro scrub_page_stosb
>> +        mov     $PAGE_SIZE, %ecx
>> +        mov     $SCRUB_BYTE_PATTERN, %eax
>> +        rep stosb
>> +        ret
>> +        .endm
>> +
>> +        .macro scrub_page_stosq
>> +        mov     $PAGE_SIZE/8, %ecx
>> +        mov     $SCRUB_PATTERN, %rax
>> +        rep stosq
>> +        ret
>> +        .endm
>> +
>> +ENTRY(scrub_page_hot)
>> +        ALTERNATIVE scrub_page_stosq, scrub_page_stosb, X86_FEATURE_ERMS
>> +        .type scrub_page_hot, @function
>> +        .size scrub_page_hot, . - scrub_page_hot
>
> From the commit message, it is not clear how the implementation for
> hot/cold was chosen. Can you outline in the commit message what are the
> assumption for each helper?

I've added 'The goal is for accesses of "cold" pages to not
disturb caches (albeit finding a good balance between this
and the higher latency looks to be difficult).'

>> @@ -1046,12 +1051,14 @@ static struct page_info *alloc_heap_page
>>          if ( first_dirty != INVALID_DIRTY_IDX ||
>>               (scrub_debug && !(memflags & MEMF_no_scrub)) )
>>          {
>> +            bool cold = d && d != current->domain;
>
> So the assumption is if the domain is not running, then the content is
> not in the cache. Is that correct?

Not exactly: For one, instead of "not running" it is "is not the
current domain", i.e. there may still be vCPU-s of the domain
running elsewhere. And for the cache the question isn't so much
of "is in cache", but to avoid needlessly bringing contents into
the cache when the data is unlikely to be used again soon.

Jan
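
For illustration only (this is not part of the patch): a minimal C sketch
of the hot/cold split discussed above, assuming an x86-64 build for the
non-temporal path. The struct domain stand-in, the scrub_is_cold() helper,
the PAGE_SIZE value and the SCRUB_PATTERN value are placeholders chosen for
the example, and the plain store loop only approximates what the
"rep stos" variants of scrub_page_hot do in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__)
#include <emmintrin.h>   /* _mm_stream_si64() (MOVNTI) and _mm_sfence() */
#endif

#define PAGE_SIZE     4096u                    /* assumed page size */
#define SCRUB_PATTERN 0xc2c2c2c2c2c2c2c2ULL    /* placeholder pattern */
#define PAGE_WORDS    (PAGE_SIZE / sizeof(unsigned long long))

/* Hypothetical stand-in for the hypervisor's struct domain. */
struct domain { int id; };

/*
 * Heuristic from the thread: a page is "cold" when it is scrubbed on
 * behalf of a known owner that is not the current domain, i.e. it is
 * unlikely to be touched again soon by the CPU doing the scrubbing.
 */
static bool scrub_is_cold(const struct domain *owner,
                          const struct domain *curr)
{
    return owner != NULL && owner != curr;
}

/* "Hot" path: ordinary stores, so the page ends up in the cache. */
static void scrub_page_hot(void *page)
{
    unsigned long long *p = page;

    for (size_t i = 0; i < PAGE_WORDS; i++)
        p[i] = SCRUB_PATTERN;
}

/* "Cold" path: non-temporal stores, intended to avoid filling the cache. */
static void scrub_page_cold(void *page)
{
#if defined(__x86_64__)
    long long *p = page;

    /* Four 8-byte stores per iteration, mirroring the quoted assembly. */
    for (size_t i = 0; i < PAGE_WORDS; i += 4) {
        _mm_stream_si64(p + i,     (long long)SCRUB_PATTERN);
        _mm_stream_si64(p + i + 1, (long long)SCRUB_PATTERN);
        _mm_stream_si64(p + i + 2, (long long)SCRUB_PATTERN);
        _mm_stream_si64(p + i + 3, (long long)SCRUB_PATTERN);
    }
    /* Order the weakly-ordered stores before returning to the caller. */
    _mm_sfence();
#else
    scrub_page_hot(page);   /* portable fallback, no cache hint */
#endif
}

/* Dispatch on the caller's hint; the page must be 8-byte aligned here. */
static void scrub_one_page(void *page, bool cold)
{
    assert(((uintptr_t)page % sizeof(unsigned long long)) == 0);

    if (cold)
        scrub_page_cold(page);
    else
        scrub_page_hot(page);
}
```

A caller would compute the hint once, e.g.
scrub_one_page(page, scrub_is_cold(d, curr)), which corresponds to the
"bool cold = d && d != current->domain;" line in the quoted hunk. Whether
the non-temporal path is a net win depends on the cache and latency
trade-off mentioned above, which this sketch does not measure.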