[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 3/3] xen: use idle vcpus to scrub pages



>>> On 25.07.14 at 09:28, <bob.liu@xxxxxxxxxx> wrote:

> On 07/25/2014 02:51 PM, Jan Beulich wrote:
>>>>> On 25.07.14 at 02:42, <bob.liu@xxxxxxxxxx> wrote:
>> 
>>> On 07/24/2014 02:24 PM, Jan Beulich wrote:
>>>>>>> On 24.07.14 at 04:08, <bob.liu@xxxxxxxxxx> wrote:
>>>>
>>>>> On 07/23/2014 03:28 PM, Jan Beulich wrote:
>>>>>>>>> On 15.07.14 at 11:16, <bob.liu@xxxxxxxxxx> wrote:
>>>>>>> After so many days I haven't make a workable solution if don't remove
>>>>>>> pages temporarily. The hardest part is iterating the heap free list
>>>>>>> without holding heap_lock because if holding the lock it might be heavy
>>>>>>> lock contention.
>>>>>>> So do you think it's acceptable if fixed all other concerns about this
>>>>>>> patch?
>>>>>>
>>>>>> No, I don't think so. Instead I'm of the opinion that you may have
>>>>>> worked in the wrong direction: Rather than not taking the heap lock
>>>>>> at all, it may also be sufficient to shrink the lock holding time (i.e.
>>>>>> avoid long loops with the lock held).
>>>>>>
>>>>>
>>>>> But I still think have to drop pages from heap list temporarily else
>>>>> heap lock must be taken for a long time to get rid of E.g. below race
>>>>> condition.
>>>>>
>>>>> A: alloc path            B: idle loop
>>>>>
>>>>>                         spin_lock(&heap_lock)
>>>>>                       page_list_for_each( pg, &heap(node, zone, order) )
>>>>>                         if _PGC_need_scrub is set, break;
>>>>>                         spin_unlock(&heap_lock)
>>>>>
>>>>>                         if ( test_bit(_PGC_need_scrub, pg)
>>>>>
>>>>> ^^^^
>>>>> spin_lock(&heap_lock)
>>>>> delist page
>>>>> spin_unlock(&heap_lock)
>>>>>
>>>>> write data to this page
>>>>>
>>>>>                          scrub_one_page(pg)
>>>>>                          ^^^ will clean useful data
>>>>
>>>> No (and I'm sure I said so before): The only problem is with the
>>>> linked list itself; the page contents are not a problem - the
>>>> allocation path can simply wait for the already suggested
>>>> _PGC_scrubbing flag to clear before returning. And as already
>>>
>>> The page contents are a problem if the race condition I mentioned in
>>> previous email happen.
>>>
>>> Because there is a time window between checking the PGC_need_scrub flag
>>> and doing the real scrub in idle thread, the idle thread will still
>>> scrub a page after that page have been allocated by allocation path and
>>> been used(and have been written some useful data).
>> 
>> Did you really read all of my previous reply?
> 
> Sure, may be I misunderstood your reply.
> If the allocation path can wait for the flag there is no problem, but I
> remember you suggested to do the scrubbing also in allocation path in
> which case I think this race condition will happen.

Two CPUs scrubbing the same page simultaneously is not a
correctness problem, only a performance one: The ultimate result
is a scrubbed page, but there may be unnecessary bus contention.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.