
Re: [PATCH v3 1/9] x86/PVH: improve Dom0 memory size calculation


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Wed, 22 Sep 2021 13:59:33 +0200
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Sep 21, 2021 at 09:16:44AM +0200, Jan Beulich wrote:
> Assuming that the accounting for IOMMU page tables will also take care
> of the P2M needs was wrong: dom0_paging_pages() can determine a far
> higher value, high enough for the system to run out of memory while
> setting up Dom0. Hence in the case of shared page tables the larger of
> the two values needs to be used (without shared page tables the sum of
> both continues to be applicable).
> 
> While there also account for two further aspects in the PV case: With
> "iommu=dom0-passthrough" no IOMMU page tables would get allocated, so
> none need accounting for. And if shadow mode is to be enabled, setting
> aside a suitable amount for the P2M pool to get populated is also
> necessary (i.e. similar to the non-shared-page-tables case of PVH).
> 
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> 
> --- a/xen/arch/x86/dom0_build.c
> +++ b/xen/arch/x86/dom0_build.c
> @@ -318,7 +318,7 @@ unsigned long __init dom0_compute_nr_pag
>      struct domain *d, struct elf_dom_parms *parms, unsigned long initrd_len)
>  {
>      nodeid_t node;
> -    unsigned long avail = 0, nr_pages, min_pages, max_pages;
> +    unsigned long avail = 0, nr_pages, min_pages, max_pages, iommu_pages = 0;
>      bool need_paging;
>  
>      /* The ordering of operands is to work around a clang5 issue. */
> @@ -337,18 +337,23 @@ unsigned long __init dom0_compute_nr_pag
>          avail -= d->max_vcpus - 1;
>  
>      /* Reserve memory for iommu_dom0_init() (rough estimate). */
> -    if ( is_iommu_enabled(d) )
> +    if ( is_iommu_enabled(d) && !iommu_hwdom_passthrough )
>      {
>          unsigned int s;
>  
>          for ( s = 9; s < BITS_PER_LONG; s += 9 )
> -            avail -= max_pdx >> s;
> +            iommu_pages += max_pdx >> s;
> +
> +        avail -= iommu_pages;
>      }
>  
> -    need_paging = is_hvm_domain(d) &&
> -        (!iommu_use_hap_pt(d) || !paging_mode_hap(d));
> +    need_paging = is_hvm_domain(d)
> +                  ? !iommu_use_hap_pt(d) || !paging_mode_hap(d)
> +                  : opt_dom0_shadow;
>      for ( ; ; need_paging = false )
>      {
> +        unsigned long paging_pages;
> +
>          nr_pages = get_memsize(&dom0_size, avail);
>          min_pages = get_memsize(&dom0_min_size, avail);
>          max_pages = get_memsize(&dom0_max_size, avail);
> @@ -377,11 +382,20 @@ unsigned long __init dom0_compute_nr_pag
>          nr_pages = min(nr_pages, max_pages);
>          nr_pages = min(nr_pages, avail);
>  
> -        if ( !need_paging )
> -            break;
> +        paging_pages = paging_mode_enabled(d) || need_paging
> +                       ? dom0_paging_pages(d, nr_pages) : 0;
>  
>          /* Reserve memory for shadow or HAP. */
> -        avail -= dom0_paging_pages(d, nr_pages);
> +        if ( !need_paging )
> +        {
> +            if ( paging_pages <= iommu_pages )
> +                break;
> +
> +            avail -= paging_pages - iommu_pages;
> +        }
> +        else
> +            avail -= paging_pages;
> +        iommu_pages = paging_pages;
>      }

I always found this loop extremely confusing to reason about. Now that
we account for the iommu page tables using separate logic, do we
really need a loop here?

In fact I would suggest something like:

unsigned long cpu_pages = 0;

if ( is_iommu_enabled(d) && !iommu_hwdom_passthrough )
{
    unsigned int s;

    for ( s = 9; s < BITS_PER_LONG; s += 9 )
        iommu_pages += max_pdx >> s;
}

[perform all the nr_pages adjustments]

if ( paging_mode_enabled(d) ||
     opt_dom0_shadow /* shadow paging gets enabled later for PV dom0. */ )
    cpu_pages = dom0_paging_pages(d, nr_pages);

if ( is_hvm_domain(d) && iommu_use_hap_pt(d) && paging_mode_hap(d) )
    avail -= max(iommu_pages, cpu_pages);
else
    avail -= cpu_pages + iommu_pages;

There will be a slight overestimation of cpu_pages, as the value
passed in doesn't account for the iommu pages when those are in use,
but overestimating is still better than underestimating.

Roger.
