
Re: [Xen-devel] [PATCH v2 3/3] xen: use idle vcpus to scrub pages



On 07/02/2014 02:27 PM, Bob Liu wrote:
> 
> On 07/01/2014 08:59 PM, Jan Beulich wrote:
>>>>> On 01.07.14 at 14:25, <bob.liu@xxxxxxxxxx> wrote:
>>> On 07/01/2014 05:12 PM, Jan Beulich wrote:
>>>>>>> On 30.06.14 at 15:39, <lliubbo@xxxxxxxxx> wrote:
>>>>> @@ -948,6 +954,7 @@ static void free_heap_pages(
>>>>>      {
>>>>>          if ( !tainted )
>>>>>          {
>>>>> +            node_need_scrub[node] = 1;
>>>>>              for ( i = 0; i < (1 << order); i++ )
>>>>>                  pg[i].count_info |= PGC_need_scrub;
>>>>>          }
>>>>
>>>> Iirc it was more than this single place where you set
>>>> PGC_need_scrub, and hence where you'd now need to set the
>>>> other flag too.
>>>>
>>>
>>> I'm afraid this is the only place where PGC_need_scrub was set.
>>
>> Ah, indeed - I misremembered others, they are all tests for the flag.
>>
>>> I'm sorry for all of the coding style problems.
>>>
>>> By the way is there any script which can be used to check the code
>>> before submitting? Something like ./scripts/checkpatch.pl under linux.
>>
>> No, there isn't. But avoiding (or spotting) hard tabs should be easy
>> enough, and other things you ought to simply inspect your patch for
>> - after all that's no different from what reviewers do.
>>
>>>>> +    }
>>>>> +
>>>>> +    /* free percpu free list */
>>>>> +    if ( !page_list_empty(local_free_list) )
>>>>> +    {
>>>>> +        spin_lock(&heap_lock);
>>>>> +        page_list_for_each_safe( pg, tmp, local_free_list )
>>>>> +        {
>>>>> +            order = PFN_ORDER(pg);
>>>>> +            page_list_del(pg, local_free_list);
>>>>> +            for ( i = 0; i < (1 << order); i++ )
>>>>> +     {
>>>>> +                pg[i].count_info |= PGC_state_free;
>>>>> +                pg[i].count_info &= ~PGC_need_scrub;
>>>>
>>>> This needs to happen earlier - the scrub flag should be cleared right
>>>> after scrubbing, and the free flag should imo be set when the page
>>>> gets freed. That's for two reasons:
>>>> 1) Hypervisor allocations don't need scrubbed pages, i.e. they can
>>>> allocate memory regardless of the scrub flag's state.
>>>
>>> AFAIR, the reason I set those flags here is to avoid a panic.
>>
>> That's pretty vague a statement.
>>
>>>> 2) You still detain the memory on the local lists from allocation. On a
>>>> many-node system, the 16Mb per node can certainly sum up (which
>>>> is not to say that I don't view the 16Mb on a single node as already
>>>> problematic).
>>>
>>> Right, but we can adjust SCRUB_BATCH_ORDER.
>>> Anyway I'll give it another try as you suggested.
>>
>> You should really drop the idea of removing pages temporarily.
>> All you need to do is make sure a page being allocated and getting
>> simultaneously scrubbed by another CPU won't get passed to the
>> caller until the scrubbing finished. In particular it's no problem if
>> the allocating CPU occasionally ends up scrubbing a page already
>> being scrubbed elsewhere.
>>
> 
> Yes, I would also like to drop the percpu lists, which would make things
> simpler. But I'm afraid that also means I can't take spin_lock(&heap_lock)
> any more because of potentially heavy lock contention. I'm not sure
> whether things can work fine without heap_lock.
> 

In my attempt to get rid of heap_lock, a panic happened while iterating
the heap free list. My implementation looked like this:
scrub_free_pages()
{
    /* Walk the heap free lists directly, without taking heap_lock. */
    for ( zone = 0; zone < NR_ZONES; zone++ )
    {
        for ( order = MAX_ORDER; order >= 0; order-- )
        {
            page_list_for_each_safe( pg, tmp, &heap(node, zone, order) )
            {
                if ( !test_bit(_PGC_need_scrub, &pg->count_info) )
                    continue;
                for ( i = 0; i < (1 << order); i++ )
                {
                    if ( test_bit(_PGC_need_scrub, &pg[i].count_info) )
                    {
                        scrub_one_page(&pg[i]);
                        clear_bit(_PGC_need_scrub, &pg[i].count_info);
                    }
                }
            }
        }
    }
}

The panic was in page_list_next(), presumably because another CPU removed
a page from (or merged a buddy back into) the same free list while I was
walking it without the lock, so the iterator followed a stale link.

I didn't find a good way to iterate the free list without holding
heap_lock, but if I hold the lock there may be heavy lock contention,
which is why I ended up removing pages temporarily from the heap free
list onto a percpu list.
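
That said, if I understand your suggestion correctly, the allocation side
would only need something like the sketch below (just a sketch -
alloc_scrub_pages() is a made-up name, and it assumes the PGC_need_scrub
flag from this series), called on pages that have already been taken off
the free lists under heap_lock:

/* Scrub any page the idle scrubber hasn't finished with yet, before it
 * is handed to the caller. Scrubbing a page twice is harmless. */
static void alloc_scrub_pages(struct page_info *pg, unsigned int order)
{
    unsigned long i;

    for ( i = 0; i < (1UL << order); i++ )
        if ( test_bit(_PGC_need_scrub, &pg[i].count_info) )
        {
            scrub_one_page(&pg[i]);
            clear_bit(_PGC_need_scrub, &pg[i].count_info);
        }
}

As you said, it would then not matter if the allocating CPU occasionally
scrubs a page that is already being scrubbed elsewhere.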

-- 
Regards,
-Bob

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

