[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 4/9] mm: Scrub memory from idle loop



>>> On 14.04.17 at 17:37, <boris.ostrovsky@xxxxxxxxxx> wrote:
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -1035,16 +1035,82 @@ merge_and_free_buddy(struct page_info *pg, unsigned 
> int node,
>      return pg;
>  }
>  
> -static void scrub_free_pages(unsigned int node)
> +static nodemask_t node_scrubbing;
> +
> +static unsigned int node_to_scrub(bool get_node)
> +{
> +    nodeid_t node = cpu_to_node(smp_processor_id()), local_node;
> +    nodeid_t closest = NUMA_NO_NODE;
> +    u8 dist, shortest = 0xff;
> +
> +    if ( node == NUMA_NO_NODE )
> +        node = 0;
> +
> +    if ( node_need_scrub[node] &&
> +         (!get_node || !node_test_and_set(node, node_scrubbing)) )
> +        return node;
> +
> +    /*
> +     * See if there are memory-only nodes that need scrubbing and choose
> +     * the closest one.
> +     */
> +    local_node = node;
> +    while ( 1 )
> +    {
> +        do {
> +            node = cycle_node(node, node_online_map);
> +        } while ( !cpumask_empty(&node_to_cpumask(node)) &&
> +                  (node != local_node) );
> +
> +        if ( node == local_node )
> +            break;
> +
> +        if ( node_need_scrub[node] )
> +        {
> +            if ( !get_node )
> +                return node;

I think the function parameter name is not / no longer suitable. The
caller wants to get _some_ node in either case. The difference is
whether it wants to just know whether there's _any_ needing scrub
work done, or whether it wants _the one_ to actually scrub on. So
how about "get_any" or "get_any_node" or just "any"?

> +            if ( !node_test_and_set(node, node_scrubbing) )
> +            {
> +                dist = __node_distance(local_node, node);
> +                if ( (dist < shortest) || (dist == NUMA_NO_DISTANCE) )
> +                {
> +                    /* Release previous node. */
> +                    if ( closest != NUMA_NO_NODE )
> +                        node_clear(closest, node_scrubbing);
> +                    shortest = dist;
> +                    closest = node;
> +                }
> +                else
> +                    node_clear(node, node_scrubbing);
> +            }

Wouldn't it be better to check distance before setting the bit in
node_scrubbing, avoiding the possible need to clear it again later
(potentially misguiding other CPUs)?

And why would NUMA_NO_DISTANCE lead to a switch of nodes?
I can see that being needed when closest == NUMA_NO_NODE,
but once you've picked one I think you should switch only when
you've found another that's truly closer.

> +        }
> +    }
> +
> +    return closest;
> +}
> +
> +bool scrub_free_pages(void)
>  {
>      struct page_info *pg;
>      unsigned int zone, order;
>      unsigned long i;
> +    unsigned int cpu = smp_processor_id();
> +    bool preempt = false;
> +    nodeid_t node;
>  
> -    ASSERT(spin_is_locked(&heap_lock));
> +    /*
> +     * Don't scrub while dom0 is being constructed since we may
> +     * fail trying to call map_domain_page() from scrub_one_page().
> +     */
> +    if ( system_state < SYS_STATE_active )
> +        return false;

I assume that's because of the mapcache vcpu override? That's x86
specific though, so the restriction here ought to be arch specific.
Even better would be to find a way to avoid this restriction
altogether, as on bigger systems only one CPU is actually busy
while building Dom0, so all others could be happily scrubbing. Could
that override become a per-CPU one perhaps?

Otoh there's not much to scrub yet until Dom0 had all its memory
allocated, and we know which pages truly remain free (wanting
what is currently the boot time scrubbing done on them). But that
point in time may still be earlier than when we switch to
SYS_STATE_active.

> @@ -1065,16 +1131,29 @@ static void scrub_free_pages(unsigned int node)
>                          pg[i].count_info &= ~PGC_need_scrub;
>                          node_need_scrub[node]--;
>                      }
> +                    if ( softirq_pending(cpu) )
> +                    {
> +                        preempt = true;
> +                        break;
> +                    }

Isn't this a little too eager, especially if you didn't have to scrub
the page on this iteration?

> @@ -1141,9 +1220,6 @@ static void free_heap_pages(
>      if ( tainted )
>          reserve_offlined_page(pg);
>  
> -    if ( need_scrub )
> -        scrub_free_pages(node);

I'd expect this eliminates the need for the need_scrub variable.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.