
Re: [PATCH v3 6/9] xen/common: add cache coloring allocator for domains


  • To: Carlo Nonato <carlo.nonato@xxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 15 Nov 2022 08:53:18 +0100
  • Cc: marco.solieri@xxxxxxxxxx, andrea.bastoni@xxxxxxxxxxxxxxx, lucmiccio@xxxxxxxxx, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Marco Solieri <marco.solieri@xxxxxxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 15 Nov 2022 07:54:09 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 14.11.2022 16:04, Carlo Nonato wrote:
> On Thu, Nov 10, 2022 at 5:47 PM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>> On 22.10.2022 17:51, Carlo Nonato wrote:
>>> --- a/xen/arch/arm/p2m.c
>>> +++ b/xen/arch/arm/p2m.c
>>> @@ -661,7 +661,12 @@ static int p2m_create_table(struct p2m_domain *p2m, lpae_t *entry)
>>>
>>>      ASSERT(!p2m_is_valid(*entry));
>>>
>>> -    page = alloc_domheap_page(NULL, 0);
>>> +    /* If cache coloring is enabled, p2m tables are allocated using the domain
>>> +     * coloring configuration to prevent cache interference. */
>>> +    if ( IS_ENABLED(CONFIG_CACHE_COLORING) )
>>> +        page = alloc_domheap_page(p2m->domain, MEMF_no_refcount);
>>
>> Are you sure you don't mean MEMF_no_owner (which implies MEMF_no_refcount)
>> here? And then ...
> 
> Yes. I've already fixed it in the v4 that I'm working on right now.
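
For reference, the v4 call would then presumably read

    page = alloc_domheap_page(p2m->domain, MEMF_no_owner);

i.e. the page is still allocated according to the domain's configuration
(node affinity, coloring), but isn't assigned to it and, since
MEMF_no_owner implies MEMF_no_refcount, isn't refcounted against it either.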
> 
>>> +    else
>>> +        page = alloc_domheap_page(NULL, 0);
>>
>> ... is it really necessary to keep the two cases separate?
> 
> Not sure. I don't know the reason behind the original code.

The difference becomes noticeable in the NUMA case, which is only being
worked on for Arm. That means converting the original call in a prereq
patch, stating the NUMA effect as the justification, might be the way to
go. (See commit a7596378e454, which introduces the flag.)
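
Such a prereq patch would then boil down to something like (illustrative
only, not tested):

  -    page = alloc_domheap_page(NULL, 0);
  +    /*
  +     * Pass the domain so a NUMA-aware allocator can pick a suitable
  +     * node; MEMF_no_owner (implying MEMF_no_refcount) avoids assigning
  +     * or accounting the page.
  +     */
  +    page = alloc_domheap_page(p2m->domain, MEMF_no_owner);

and the coloring patch then wouldn't need the if/else at all.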

>>> @@ -1926,24 +2106,49 @@ static unsigned long avail_heap_pages(
>>>  void __init end_boot_allocator(void)
>>>  {
>>>      unsigned int i;
>>> +    unsigned long buddy_pages;
>>>
>>> -    /* Pages that are free now go to the domain sub-allocator. */
>>> -    for ( i = 0; i < nr_bootmem_regions; i++ )
>>> +    buddy_pages = PFN_DOWN(buddy_alloc_size);
>>
>> Any reason this can't be the initializer of the variable?
> 
> Nope. The end_boot_allocator() changes are a bit messy. In v4 I'm doing
> things more cleanly, moving everything into init_color_heap_pages().
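
FWIW that would then simply be

    unsigned long buddy_pages = PFN_DOWN(buddy_alloc_size);

but if it all moves into init_color_heap_pages() the point is moot anyway.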
> 
>>> +    if ( !IS_ENABLED(CONFIG_CACHE_COLORING) )
>>>      {
>>> -        struct bootmem_region *r = &bootmem_region_list[i];
>>> -        if ( (r->s < r->e) &&
>>> -             (phys_to_nid(pfn_to_paddr(r->s)) == cpu_to_node(0)) )
>>> +        /* Pages that are free now go to the domain sub-allocator. */
>>> +        for ( i = 0; i < nr_bootmem_regions; i++ )
>>>          {
>>> -            init_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s);
>>> -            r->e = r->s;
>>> -            break;
>>> +            struct bootmem_region *r = &bootmem_region_list[i];
>>> +            if ( (r->s < r->e) &&
>>
>> Even if you're only re-indenting the original code (which personally I'd
>> prefer was avoided), please add the missing blank line between the
>> declaration and the statement here.
>>
>>> +                (phys_to_nid(pfn_to_paddr(r->s)) == cpu_to_node(0)) )
>>> +            {
>>> +                init_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s);
>>> +                r->e = r->s;
>>> +                break;
>>> +            }
>>>          }
>>>      }
>>> +
>>>      for ( i = nr_bootmem_regions; i-- > 0; )
>>>      {
>>> -        struct bootmem_region *r = &bootmem_region_list[i];
>>> +        struct bootmem_region *r;
>>> +
>>> +        if ( IS_ENABLED(CONFIG_CACHE_COLORING) )
>>> +            r = &bootmem_region_list[nr_bootmem_regions - i - 1];
>>
>> If you want to handle things low-to-high, why don't you alter the
>> earlier loop rather than skipping (and re-indenting) it?
> 
> Yes, you're right.
> 
>> However,
>> considering that in alloc_color_heap_page() you prefer pages at
>> higher addresses, I continue to find it odd that here you want to
>> process low-address pages first.
> 
> It doesn't matter whether alloc_color_heap_page() returns higher or lower
> addresses. The important thing is to keep the list sorted so that the min
> and max are easily found. With a sorted list, insertion is cheap as long
> as addresses arrive always increasing or always decreasing: starting from
> either the head or the tail, the insertion position is then found in O(1).
> If regions are processed high-to-low but the pages of each region
> low-to-high, the always-increasing/always-decreasing property no longer
> holds and a linear search is needed for each insertion. This problem can
> be solved in many ways, and doing everything low-to-high is one solution.

Of course. But please also include code churn in the set of criteria to
apply. Page lists are doubly linked, so it shouldn't matter whether
you insert forwards or backwards.
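
To make this concrete, here's a self-contained sketch with hypothetical
structures (deliberately not the real Xen page_list API):

    /* Hypothetical types, for illustration only. */
    struct page {
        unsigned long addr;          /* sort key: page address */
        struct page *prev, *next;    /* doubly linked, sorted by addr */
    };

    struct page_list {
        struct page *head, *tail;
    };

    /*
     * Insert pg keeping the list sorted by increasing address.  Scanning
     * from the tail means that, when pages arrive in increasing address
     * order, the loop exits after a single comparison and every insertion
     * is O(1); an arbitrary arrival order degrades to a linear search.
     */
    static void sorted_insert(struct page_list *l, struct page *pg)
    {
        struct page *cur = l->tail;

        while ( cur && cur->addr > pg->addr )
            cur = cur->prev;

        pg->prev = cur;
        pg->next = cur ? cur->next : l->head;
        if ( pg->next )
            pg->next->prev = pg;
        else
            l->tail = pg;
        if ( cur )
            cur->next = pg;
        else
            l->head = pg;
    }

The mirror image (scan from the head while cur->addr < pg->addr, linking
symmetrically) is O(1) for a decreasing arrival order - which is exactly
why, with a doubly linked list, the processing direction of the bootmem
regions shouldn't need to change just for this.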

Jan
