[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 1/9] x86/PVH: improve Dom0 memory size calculation


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 29 Sep 2021 12:53:22 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=EkUe7PDWn48Y6jUAL8tFmgb8Z61St5FYaV0Td7l5Dl0=; b=SmtBcb9HIVuIzMQaV6XEBBvsKgMh1lOe61tWcnncah/afh7IX8K/Z0vKXqQwEFzkH4hynZNxamtI7YjRSMphvSSOIwSFUig2zpau+Icggkdli8Zzsg2jia/ZMxI+A2VjWBJQJ1Alg7TNfUtuEvWVYqasBfLhDS2BKVCXuA7rXb/dXqc8kHV9yxfuWmC9cxzEqLYyHWPq7W5wg2+oXg7dezOfA2GnYf9NUmDYdX10/Vum7Q5eKDdq14eD1D0mjFAmgMCTwirUGFLWaEDTo7CaBbPUl/os2l+AC3HDb2ZD/xbsKaewXc8PbWYvnudSabzMJF4tUH5CLEFivtHn7+CzFQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=FQh+EArVp4u1VKtoH7ngVnuvRYmo31hFcR/vdfMD3YgbOjOb6xnwqRev5//CSeHggyEu2kSH23GoBX5rzAOyJreubO85C5DNq7QAWSQEv/INYW6y9UQapk6pl9Ta2ifultUy310j8z0rhYHYZ0d92+UOweS/x9klT5nn8DewjFoeQChZlCfMuub/k/GvW0D9tdch/prAVNaZiEgpIy5tWhrthLuhcw4uwsx4RnYUkj5baMwvKLzp8OyoHHI1rh+IGCrPQFP8fLQn0zp3ryJc3V1dBWU1q5Gqsb9enZu+kZjK6SCD4GRK250+XPKcLhTp+unZmfjPQMjqwJb9N35JJw==
  • Authentication-results: xen.org; dkim=none (message not signed) header.d=none;xen.org; dmarc=none action=none header.from=suse.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Wed, 29 Sep 2021 10:53:43 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 22.09.2021 13:59, Roger Pau Monné wrote:
> On Tue, Sep 21, 2021 at 09:16:44AM +0200, Jan Beulich wrote:
>> @@ -337,18 +337,23 @@ unsigned long __init dom0_compute_nr_pag
>>          avail -= d->max_vcpus - 1;
>>  
>>      /* Reserve memory for iommu_dom0_init() (rough estimate). */
>> -    if ( is_iommu_enabled(d) )
>> +    if ( is_iommu_enabled(d) && !iommu_hwdom_passthrough )
>>      {
>>          unsigned int s;
>>  
>>          for ( s = 9; s < BITS_PER_LONG; s += 9 )
>> -            avail -= max_pdx >> s;
>> +            iommu_pages += max_pdx >> s;
>> +
>> +        avail -= iommu_pages;
>>      }
>>  
>> -    need_paging = is_hvm_domain(d) &&
>> -        (!iommu_use_hap_pt(d) || !paging_mode_hap(d));
>> +    need_paging = is_hvm_domain(d)
>> +                  ? !iommu_use_hap_pt(d) || !paging_mode_hap(d)
>> +                  : opt_dom0_shadow;
>>      for ( ; ; need_paging = false )
>>      {
>> +        unsigned long paging_pages;
>> +
>>          nr_pages = get_memsize(&dom0_size, avail);
>>          min_pages = get_memsize(&dom0_min_size, avail);
>>          max_pages = get_memsize(&dom0_max_size, avail);
>> @@ -377,11 +382,20 @@ unsigned long __init dom0_compute_nr_pag
>>          nr_pages = min(nr_pages, max_pages);
>>          nr_pages = min(nr_pages, avail);
>>  
>> -        if ( !need_paging )
>> -            break;
>> +        paging_pages = paging_mode_enabled(d) || need_paging
>> +                       ? dom0_paging_pages(d, nr_pages) : 0;
>>  
>>          /* Reserve memory for shadow or HAP. */
>> -        avail -= dom0_paging_pages(d, nr_pages);
>> +        if ( !need_paging )
>> +        {
>> +            if ( paging_pages <= iommu_pages )
>> +                break;
>> +
>> +            avail -= paging_pages - iommu_pages;
>> +        }
>> +        else
>> +            avail -= paging_pages;
>> +        iommu_pages = paging_pages;
>>      }
> 
> I always found this loop extremely confusing to reason about. Now that
> we account for the iommu page tables using separate logic, do we
> really need a loop here?
> 
> In fact I would suggest something like:
> 
> unsigned long cpu_pages = 0;
> 
> if ( is_iommu_enabled(d) && !iommu_hwdom_passthrough )
> {
>     unsigned int s;
> 
>     for ( s = 9; s < BITS_PER_LONG; s += 9 )
>         iommu_pages += max_pdx >> s;
> }
> 
> [perform all the nr_pages adjustments]
> 
> if ( paging_mode_enabled(d) ||
>      opt_dom0_shadow /* shadow paging gets enabled later for PV dom0. */ )
>     cpu_pages = dom0_paging_pages(d, nr_pages);
> 
> if ( is_hvm_domain(d) && iommu_use_hap_pt(d) && paging_mode_hap(d) )
>     avail -= max(iommu_pages, cpu_pages);
> else
>     avail -= cpu_pages + iommu_pages;
> 
> There will be a slight over estimation of cpu_pages, as the value
> passed in doesn't account for the iommu pages in case they are used,
> but still it's better to over estimate than to under estimate.

Overestimating cpu_pages isn't a primary concern, I agree. But we
want to get the min/max clamping reasonably close. Therefore, while
dropping the loop as you suggest I've introduced two instances of
that logic, before and after calculating cpu_pages. I'll see to
send out v4 soonish (provided the result actually works for, in
particular, the case that I needed the fix for in the first place).

Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.