
Re: [Xen-devel] [PATCH v3 4/9] mm: Scrub memory from idle loop



>>> On 04.05.17 at 19:09, <boris.ostrovsky@xxxxxxxxxx> wrote:
> On 05/04/2017 11:31 AM, Jan Beulich wrote:
>>>>> On 14.04.17 at 17:37, <boris.ostrovsky@xxxxxxxxxx> wrote:
>>> --- a/xen/common/page_alloc.c
>>> +++ b/xen/common/page_alloc.c
>>> @@ -1035,16 +1035,82 @@ merge_and_free_buddy(struct page_info *pg, unsigned int node,
>>>      return pg;
>>>  }
>>>  
>>> -static void scrub_free_pages(unsigned int node)
>>> +static nodemask_t node_scrubbing;
>>> +
>>> +static unsigned int node_to_scrub(bool get_node)
>>> +{
>>> +    nodeid_t node = cpu_to_node(smp_processor_id()), local_node;
>>> +    nodeid_t closest = NUMA_NO_NODE;
>>> +    u8 dist, shortest = 0xff;
>>> +
>>> +    if ( node == NUMA_NO_NODE )
>>> +        node = 0;
>>> +
>>> +    if ( node_need_scrub[node] &&
>>> +         (!get_node || !node_test_and_set(node, node_scrubbing)) )
>>> +        return node;
>>> +
>>> +    /*
>>> +     * See if there are memory-only nodes that need scrubbing and choose
>>> +     * the closest one.
>>> +     */
>>> +    local_node = node;
>>> +    while ( 1 )
>>> +    {
>>> +        do {
>>> +            node = cycle_node(node, node_online_map);
>>> +        } while ( !cpumask_empty(&node_to_cpumask(node)) &&
>>> +                  (node != local_node) );
>>> +
>>> +        if ( node == local_node )
>>> +            break;
>>> +
>>> +        if ( node_need_scrub[node] )
>>> +        {
>>> +            if ( !get_node )
>>> +                return node;
>> I think the function parameter name is not / no longer suitable. The
>> caller wants to get _some_ node in either case. The difference is
>> whether it wants to just know whether there's _any_ needing scrub
>> work done, or whether it wants _the one_ to actually scrub on. So
>> how about "get_any" or "get_any_node" or just "any"?
> 
> Not only to find out whether there is anything to scrub but, if get_node
> is true, to actually "get" it, i.e. set the bit in the node_scrubbing
> mask. Thus the name.

Hmm, okay, in that case at least an explanatory comment should be
added.
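
[For illustration, such a comment might look like the following sketch;
the wording is hypothetical, not taken from any posted version of the
patch:]

    /*
     * If get_node is true, the returned node is also marked in the
     * node_scrubbing mask, i.e. the caller claims it and must clear
     * the bit again when done scrubbing.  With get_node false the
     * function merely reports whether some node needs scrubbing,
     * without claiming one.
     */
    static unsigned int node_to_scrub(bool get_node)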

>>> +bool scrub_free_pages(void)
>>>  {
>>>      struct page_info *pg;
>>>      unsigned int zone, order;
>>>      unsigned long i;
>>> +    unsigned int cpu = smp_processor_id();
>>> +    bool preempt = false;
>>> +    nodeid_t node;
>>>  
>>> -    ASSERT(spin_is_locked(&heap_lock));
>>> +    /*
>>> +     * Don't scrub while dom0 is being constructed since we may
>>> +     * fail trying to call map_domain_page() from scrub_one_page().
>>> +     */
>>> +    if ( system_state < SYS_STATE_active )
>>> +        return false;
>> I assume that's because of the mapcache vcpu override? That's x86
>> specific though, so the restriction here ought to be arch specific.
>> Even better would be to find a way to avoid this restriction
>> altogether, as on bigger systems only one CPU is actually busy
>> while building Dom0, so all others could be happily scrubbing. Could
>> that override become a per-CPU one perhaps?
> 
> Is it worth doing though? What you are saying below is exactly why I
> simply return here --- there were very few dirty pages.

Well, in that case the comment should cover this second reason as
well, at the very least.
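
[A sketch of a comment covering both reasons might read as below;
the wording is illustrative only, and the mapcache override part is
Jan's guess above, not something confirmed in this thread:]

    /*
     * Don't scrub while dom0 is being constructed: map_domain_page()
     * may fail when called from scrub_one_page() (on x86, due to the
     * mapcache vcpu override), and this early there are very few
     * dirty pages anyway, so there is little to gain.
     */
    if ( system_state < SYS_STATE_active )
        return false;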

> This may change
> if we decide to use idle-loop scrubbing for boot scrubbing as well (as
> Andrew suggested earlier) but there is little reason to do it now IMO.

Why not? In fact I had meant to ask why your series doesn't
include that.

>> Otoh there's not much to scrub until Dom0 has had all its memory
>> allocated and we know which pages truly remain free (and hence
>> want what is currently the boot-time scrubbing done on them). But
>> that point in time may still be earlier than when we switch to
>> SYS_STATE_active.

IOW I think boot scrubbing could be kicked off as soon as Dom0
has had the bulk of its memory allocated.

>>> @@ -1065,16 +1131,29 @@ static void scrub_free_pages(unsigned int node)
>>>                          pg[i].count_info &= ~PGC_need_scrub;
>>>                          node_need_scrub[node]--;
>>>                      }
>>> +                    if ( softirq_pending(cpu) )
>>> +                    {
>>> +                        preempt = true;
>>> +                        break;
>>> +                    }
>> Isn't this a little too eager, especially if you didn't have to scrub
>> the page on this iteration?
> 
> What would be a good place then? Count how many pages were actually
> scrubbed and check for pending softirqs every so many?

Yes.

> Even if we don't scrub at all, walking the whole heap can take a while.

Correct - you can't skip this check altogether even if no page
requires actual scrubbing.
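
[One way to implement this, shown as a sketch only: the constants are
illustrative, not from the patch. Scrubbed pages are weighted more
heavily than merely inspected ones, and softirqs are polled whenever
the counter crosses a threshold:]

    unsigned int cnt = 0;

    for ( i = 0; i < (1UL << order); i++ )
    {
        if ( pg[i].count_info & PGC_need_scrub )
        {
            scrub_one_page(&pg[i]);
            pg[i].count_info &= ~PGC_need_scrub;
            node_need_scrub[node]--;
            cnt += 100;  /* scrubbing a page is expensive */
        }
        else
            cnt++;       /* but walking the heap isn't free either */

        if ( cnt >= 1000 )
        {
            cnt = 0;
            if ( softirq_pending(cpu) )
            {
                preempt = true;
                break;
            }
        }
    }

[This keeps the check cheap on iterations that skip clean pages while
still guaranteeing it is reached even when nothing needs scrubbing.]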

>>> @@ -1141,9 +1220,6 @@ static void free_heap_pages(
>>>      if ( tainted )
>>>          reserve_offlined_page(pg);
>>>  
>>> -    if ( need_scrub )
>>> -        scrub_free_pages(node);
>> I'd expect this eliminates the need for the need_scrub variable.
> 
> We still need it to decide whether to set PGC_need_scrub on pages.

Oh, of course.

Jan
