[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 4/8] mm: Scrub memory from idle loop



>>> On 12.06.17 at 19:01, <boris.ostrovsky@xxxxxxxxxx> wrote:
> On 06/12/2017 04:08 AM, Jan Beulich wrote:
>>>>> On 19.05.17 at 17:50, <boris.ostrovsky@xxxxxxxxxx> wrote:
>>> Instead of scrubbing pages during guest destruction (from
>>> free_heap_pages()) do this opportunistically, from the idle loop.
>> This is too brief for my taste. In particular the re-ordering ...
>>
>>> --- a/xen/arch/x86/domain.c
>>> +++ b/xen/arch/x86/domain.c
>>> @@ -118,8 +118,9 @@ static void idle_loop(void)
>>>      {
>>>          if ( cpu_is_offline(smp_processor_id()) )
>>>              play_dead();
>>> -        (*pm_idle)();
>>>          do_tasklet();
>>> +        if ( cpu_is_haltable(smp_processor_id()) && !scrub_free_pages() )
>>> +            (*pm_idle)();
>>>          do_softirq();
>> ... here (and its correctness / safety) needs an explanation. Not
>> processing tasklets right after idle wakeup is a not obviously
>> correct change. Nor is it immediately clear why this needs to be
>> pulled ahead for your purposes.
> 
> Are you asking for a comment here (i.e. the explanation given by Dario
> (added)  in
> https://lists.xenproject.org/archives/html/xen-devel/2017-05/msg01215.html)
> or are you saying something is wrong?

To judge whether the change is correct I'd like to have an
explanation in the commit message.

>>> +        if ( node_need_scrub[node] )
>>> +        {
>>> +            if ( !get_node )
>>> +                return node;
>>> +
>>> +            dist = __node_distance(local_node, node);
>>> +            if ( dist < shortest || closest == NUMA_NO_NODE )
>>> +            {
>>> +                if ( !node_test_and_set(node, node_scrubbing) )
>>> +                {
>>> +                    if ( closest != NUMA_NO_NODE )
>>> +                        node_clear(closest, node_scrubbing);
>> You call this function with no locks held, yet you temporarily set a
>> bit in node_scrubbing which you then may clear again here. That'll
>> prevent another idle CPU to do scrubbing on this node afaict,
>> which, while not a bug, is not optimal. Therefore I think for this
>> second part of the function you actually want to acquire the heap
>> lock.
> 
> I actually specifically didn't want to take the heap lock here since we
> will be calling this routine quite often and in most cases no scrubbing
> will be needed.

I'm not convinced - memory-only nodes shouldn't be that common,
so in a presumably large share of cases we don't even need to
enter this second part of the function. And if a debatable choice
is being made, giving the reason in a short comment would help
reviewers judge for themselves whether indeed the chosen
approach is the lesser of two (or more) possible evils.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.