Re: [Xen-devel] [PATCH 2/2] xen: spread page scrubbing across all idle CPU



On 06/10/2014 10:12 PM, Jan Beulich wrote:
>>>> On 10.06.14 at 14:18, <lliubbo@xxxxxxxxx> wrote:
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -593,6 +593,10 @@ int domain_kill(struct domain *d)
>>              BUG_ON(rc != -EAGAIN);
>>              break;
>>          }
>> +
>> +        tasklet_init(&global_scrub_tasklet, scrub_free_pages, 1);
>> +        tasklet_schedule_on_cpu(&global_scrub_tasklet, smp_processor_id());
> 
> Apart from this being a single global that you can't validly re-use
> this way (as already pointed out by Andrew), it also remains
> unclear why you want this scheduled on the current CPU rather
> than just anywhere.
> 

The reason is the same as the one you noticed later: if this tasklet is
also scheduled on other CPUs, their respective vCPUs won't get to run at
all until the scrubbing is done.

So on the other CPUs I hook into idle_loop() instead of using a tasklet.
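
Roughly like this (just a sketch of the idle-loop side; the exact
placement inside idle_loop() in xen/arch/x86/domain.c is my assumption,
since that hunk isn't quoted here):

    static void idle_loop(void)
    {
        for ( ; ; )
        {
            /* ... existing idle handling (play_dead(), pm_idle, ...) ... */

            /* Opportunistically help drain the global scrub list while
             * idle; is_tasklet == 0, so this path never re-arms the
             * tasklet. */
            if ( !page_list_empty(&global_scrub_list) )
                scrub_free_pages(0);

            do_softirq();
        }
    }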

>> --- a/xen/common/page_alloc.c
>> +++ b/xen/common/page_alloc.c
>> @@ -79,6 +79,14 @@ PAGE_LIST_HEAD(page_offlined_list);
>>  /* Broken page list, protected by heap_lock. */
>>  PAGE_LIST_HEAD(page_broken_list);
>>  
>> +/* Global scrub page list, protected by scrub_lock */
>> +PAGE_LIST_HEAD(global_scrub_list);
>> +DEFINE_SPINLOCK(scrub_lock);
>> +struct tasklet global_scrub_tasklet;
>> +
>> +DEFINE_PER_CPU(struct page_list_head, scrub_list);
>> +DEFINE_PER_CPU(struct page_list_head, scrubbed_list);
> 
> All static I suppose?
> 
>> @@ -1680,6 +1693,62 @@ void scrub_one_page(struct page_info *pg)
>>      unmap_domain_page(p);
>>  }
>>  
>> +#define SCRUB_BATCH 1024
>> +void scrub_free_pages(unsigned long is_tasklet)
>> +{
>> +    struct page_info *pg;
>> +    unsigned int i, cpu, empty = 0;
>> +
>> +    cpu = smp_processor_id();
>> +    do
>> +    {
>> +        if ( page_list_empty(&this_cpu(scrub_list)) )
>> +        {
>> +            /* Delist a batch of pages from global scrub list */
>> +            spin_lock(&scrub_lock);
>> +            for( i = 0; i < SCRUB_BATCH; i++ )
>> +            {
>> +                pg = page_list_remove_head(&global_scrub_list);
>> +                if ( !pg )
>> +                {
>> +                    empty = 1;
>> +                    break;
>> +                }
>> +                page_list_add_tail(pg, &this_cpu(scrub_list));
>> +             }
>> +             spin_unlock(&scrub_lock);
>> +        }
>> +
>> +        /* Scrub percpu list */
>> +        while ( !page_list_empty(&this_cpu(scrub_list)) )
>> +        {
>> +            pg = page_list_remove_head(&this_cpu(scrub_list));
>> +            ASSERT(pg);
>> +            scrub_one_page(pg);
>> +            page_list_add_tail(pg, &this_cpu(scrubbed_list));
>> +        }
> 
> So you scrub up to 1k pages at a time here - how long does that take?
> Did you take into consideration how much higher a wakeup latency this
> may cause when the invocation comes from the idle vCPU?
> 

An extra softirq_pending(cpu) check can be added to this while loop; I
don't think the latency is a problem any more if we re-check after every
single page scrubbed.
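
I.e. the scrub loop above would become something like this (same code as
in the patch, only the loop condition changes):

        /* Scrub percpu list, one page at a time, bailing out as soon
         * as other work is pending on this CPU. */
        while ( !page_list_empty(&this_cpu(scrub_list)) &&
                !softirq_pending(cpu) )
        {
            pg = page_list_remove_head(&this_cpu(scrub_list));
            ASSERT(pg);
            scrub_one_page(pg);
            page_list_add_tail(pg, &this_cpu(scrubbed_list));
        }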

>> +        /* Free percpu scrubbed list */
>> +        if ( !page_list_empty(&this_cpu(scrubbed_list)) )
>> +        {
>> +            spin_lock(&heap_lock);
>> +            while ( !page_list_empty(&this_cpu(scrubbed_list)) )
>> +            {
>> +                pg = page_list_remove_head(&this_cpu(scrubbed_list));
>> +                __free_heap_pages(pg, 0);
>> +            }
>> +            spin_unlock(&heap_lock);
>> +        }
>> +
>> +        /* Global list is empty */
>> +        if (empty)
>> +            return;
>> +    } while ( !softirq_pending(cpu) );
>> +
>> +    if( is_tasklet )
>> +        tasklet_schedule_on_cpu(&global_scrub_tasklet, cpu);
> 
> So you re-schedule this tasklet immediately - while this may be
> acceptable inside the hypervisor, did you consider the effect this
> will have on the guest (normally Dom0)? Its respective vCPU won't
> get run _at all_ until you're done scrubbing.
> 

Yes, that's a problem. I don't have a better idea right now.

What I'm trying to do is run the scrubbing on the current CPU and on all
idle vCPUs in parallel.
I also considered your suggestion of doing the scrubbing in the
background as well as on the allocation path, but I think it's even less
acceptable for users to get blocked for an unpredictable amount of time
when allocating a large amount of memory.
That's why I still chose the synchronous approach, where all memory has
been scrubbed by the time 'xl destroy' returns.
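
To spell out the intended split (using only identifiers from the patch;
the idle-CPU half is the idle_loop() hook sketched earlier in this mail):

    /* CPU running domain_kill(): drive scrubbing via the tasklet, which
     * re-schedules itself (is_tasklet == 1) until global_scrub_list is
     * empty and 'xl destroy' can return. */
    tasklet_init(&global_scrub_tasklet, scrub_free_pages, 1);
    tasklet_schedule_on_cpu(&global_scrub_tasklet, smp_processor_id());

    /* All other CPUs: idle_loop() calls scrub_free_pages(0), so idle
     * CPUs drain global_scrub_list in parallel without displacing any
     * busy vCPU. */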

> In the end, this being an improvement by about 50% according to
> the numbers you gave, I'm not convinced the downsides of this are
> worth the gain.
> 
> Jan
> 

-- 
Regards,
-Bob

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
