Re: [Xen-devel] [PATCH 2/2] xen: spread page scrubbing across all idle CPU
On 06/10/2014 10:12 PM, Jan Beulich wrote:
>>>> On 10.06.14 at 14:18, <lliubbo@xxxxxxxxx> wrote:
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -593,6 +593,10 @@ int domain_kill(struct domain *d)
>>              BUG_ON(rc != -EAGAIN);
>>              break;
>>          }
>> +
>> +        tasklet_init(&global_scrub_tasklet, scrub_free_pages, 1);
>> +        tasklet_schedule_on_cpu(&global_scrub_tasklet, smp_processor_id());
>
> Apart from this being a single global that you can't validly re-use
> this way (as already pointed out by Andrew), it also remains
> unclear why you want this scheduled on the current CPU rather
> than just anywhere.
>

The reason is the same as the one you noticed later: if this tasklet is
also scheduled on other CPUs, their respective vCPUs won't get run at
all until the scrubbing is done. So I use idle_loop() instead of a
tasklet on the other CPUs.

>> --- a/xen/common/page_alloc.c
>> +++ b/xen/common/page_alloc.c
>> @@ -79,6 +79,14 @@ PAGE_LIST_HEAD(page_offlined_list);
>>  /* Broken page list, protected by heap_lock. */
>>  PAGE_LIST_HEAD(page_broken_list);
>>
>> +/* Global scrub page list, protected by scrub_lock */
>> +PAGE_LIST_HEAD(global_scrub_list);
>> +DEFINE_SPINLOCK(scrub_lock);
>> +struct tasklet global_scrub_tasklet;
>> +
>> +DEFINE_PER_CPU(struct page_list_head, scrub_list);
>> +DEFINE_PER_CPU(struct page_list_head, scrubbed_list);
>
> All static I suppose?
>
>> @@ -1680,6 +1693,62 @@ void scrub_one_page(struct page_info *pg)
>>      unmap_domain_page(p);
>>  }
>>
>> +#define SCRUB_BATCH 1024
>> +void scrub_free_pages(unsigned long is_tasklet)
>> +{
>> +    struct page_info *pg;
>> +    unsigned int i, cpu, empty = 0;
>> +
>> +    cpu = smp_processor_id();
>> +    do
>> +    {
>> +        if ( page_list_empty(&this_cpu(scrub_list)) )
>> +        {
>> +            /* Delist a batch of pages from global scrub list */
>> +            spin_lock(&scrub_lock);
>> +            for( i = 0; i < SCRUB_BATCH; i++ )
>> +            {
>> +                pg = page_list_remove_head(&global_scrub_list);
>> +                if ( !pg )
>> +                {
>> +                    empty = 1;
>> +                    break;
>> +                }
>> +                page_list_add_tail(pg, &this_cpu(scrub_list));
>> +            }
>> +            spin_unlock(&scrub_lock);
>> +        }
>> +
>> +        /* Scrub percpu list */
>> +        while ( !page_list_empty(&this_cpu(scrub_list)) )
>> +        {
>> +            pg = page_list_remove_head(&this_cpu(scrub_list));
>> +            ASSERT(pg);
>> +            scrub_one_page(pg);
>> +            page_list_add_tail(pg, &this_cpu(scrubbed_list));
>> +        }
>
> So you scrub up to 1k pages at a time here - how long does that take?
> Did you take into consideration how much higher a wakeup latency this
> may cause when the invocation comes from the idle vCPU?
>

An extra softirq_pending(cpu) check can be added in this while loop; I
think the latency is no longer a problem if only one page is scrubbed
between checks. (A rough sketch of what I mean is at the end of this
mail.)

>> +        /* Free percpu scrubbed list */
>> +        if ( !page_list_empty(&this_cpu(scrubbed_list)) )
>> +        {
>> +            spin_lock(&heap_lock);
>> +            while ( !page_list_empty(&this_cpu(scrubbed_list)) )
>> +            {
>> +                pg = page_list_remove_head(&this_cpu(scrubbed_list));
>> +                __free_heap_pages(pg, 0);
>> +            }
>> +            spin_unlock(&heap_lock);
>> +        }
>> +
>> +        /* Global list is empty */
>> +        if (empty)
>> +            return;
>> +    } while ( !softirq_pending(cpu) );
>> +
>> +    if( is_tasklet )
>> +        tasklet_schedule_on_cpu(&global_scrub_tasklet, cpu);
>
> So you re-schedule this tasklet immediately - while this may be
> acceptable inside the hypervisor, did you consider the effect this
> will have on the guest (normally Dom0)? Its respective vCPU won't
> get run _at all_ until you're done scrubbing.
>

Yes, that's a problem. I don't have any better idea right now.
What I'm trying to do is the scrubbing on the current CPU as well as on
all idle vCPUs in parallel. I also considered your suggestion of doing
the scrubbing in the background as well as on the allocation path, but I
think it is even less acceptable for users to get blocked at random for
an unpredictable time when allocating a large amount of memory. That's
why I still chose the synchronous way: once 'xl destroy' returns, all
memory has been scrubbed.

> In the end, this being an improvement by about 50% according to
> the numbers you gave, I'm not convinced the downsides of this are
> worth the gain.
>
> Jan
>

--
Regards,
-Bob
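P.S. To make the softirq_pending() check I mentioned above a bit more
concrete, the per-CPU part of the loop in scrub_free_pages() could look
roughly like below. This is only an untested sketch on top of the
helpers introduced by the posted patch, not a new version of the patch:

        /*
         * Scrub the percpu list one page at a time so that the idle
         * vCPU can bail out quickly once there is real work pending.
         */
        while ( !page_list_empty(&this_cpu(scrub_list)) )
        {
            pg = page_list_remove_head(&this_cpu(scrub_list));
            ASSERT(pg);
            scrub_one_page(pg);
            page_list_add_tail(pg, &this_cpu(scrubbed_list));

            /* Stop as soon as some other work wants this CPU. */
            if ( softirq_pending(cpu) )
                break;
        }

The pages already scrubbed would still get freed from scrubbed_list as
in the patch, so nothing is lost when we break out early.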