[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 1/6] x86/PVH: improve Dom0 memory size calculation


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Fri, 22 Oct 2021 11:58:53 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=FenLaUIXSE7u3OniYmEEaoPqXXNQJjXPZQlKQi+YVY4=; b=Wz3w/zPgPVod/MkJYVEw4ENS5n35jEj6br1V5uEdELiICHLhyHH0JY0AzWGCjCElmUnRpcW/67vSP7gDnikpcBit1XlAYV1LWBmZTHu5hS1HOKGThzsezUBybBhWS4t9WOSgTyvQ5OTsvWJTnuYnHqRb8sCEstBY5wu9f1IFryswiC85zYN1Hy/TJ/sRPQ9K/eldOwNQ0w2UUgXYG+Vj1Xmy1BpMLxWjc21+VjLq7S4R6vAY5WRWcK7SjvVveH5K2eZMuNoHhvTG/ztHdE9UgBJRibnk17N+0uL71wNv3Oab/TX1rwnxgm9wo58kbAfym6mX4iyBfZXyN4NJCIu9Xw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Wny2NuhX6Kd1BQXkCcHmmpYeKZb0HPenzwlvs9Pn7hmSSeie2nhyCnjjrrgaQVGvOwgbKrPdpBYLhcw3BEJed0mqktt9XrMrYuiySwxCqRD8FrpEdjEpTHC3PxpVRsjCA1zgQ0N948a0lyQGlucXuvDTfQx9fLxxu2tais1iNw/4ndmkDx8fMYBTKOP0Sc8ErWcqsB1qK/oQKVNCVFvnE+chUN4aTIKgrvis7j1/drLzVlbEedt9EFoaH1CWTcp/B5IzdK5AHSwIJzsE6ht/HiQjkM4y6kTxOyc6eg3U6vaiUVBoYocMSuovNGbtD6Uy7ddc2Vo1BkHwoxJGkd2jZQ==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Fri, 22 Oct 2021 09:59:00 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 22.10.2021 11:55, Roger Pau Monné wrote:
> On Wed, Sep 29, 2021 at 03:13:24PM +0200, Jan Beulich wrote:
>> Assuming that the accounting for IOMMU page tables will also take care
>> of the P2M needs was wrong: dom0_paging_pages() can determine a far
>> higher value, high enough for the system to run out of memory while
>> setting up Dom0. Hence in the case of shared page tables the larger of
>> the two values needs to be used (without shared page tables the sum of
>> both continues to be applicable).
>>
>> To not further complicate the logic, eliminate the up-to-2-iteration
>> loop in favor of doing a few calculations twice (before and after
>> calling dom0_paging_pages()). While this will lead to slightly too high
>> a value in "cpu_pages", it is deemed better to account a few too many
>> than a few too little.
>>
>> Also uniformly use paging_mode_enabled(), not is_hvm_domain().
>>
>> While there also account for two further aspects in the PV case: With
>> "iommu=dom0-passthrough" no IOMMU page tables would get allocated, so
>> none need accounting for. And if shadow mode is to be enabled, setting
>> aside a suitable amount for the P2M pool to get populated is also
>> necessary (i.e. similar to the non-shared-page-tables case of PVH).
>>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>> ---
>> I wonder whether this isn't enough to drop the "PVH dom0 without
>> dom0_mem" warning.
>>
>> --- a/xen/arch/x86/dom0_build.c
>> +++ b/xen/arch/x86/dom0_build.c
>> @@ -318,8 +318,7 @@ unsigned long __init dom0_compute_nr_pag
>>      struct domain *d, struct elf_dom_parms *parms, unsigned long initrd_len)
>>  {
>>      nodeid_t node;
>> -    unsigned long avail = 0, nr_pages, min_pages, max_pages;
>> -    bool need_paging;
>> +    unsigned long avail = 0, nr_pages, min_pages, max_pages, iommu_pages = 
>> 0;
>>  
>>      /* The ordering of operands is to work around a clang5 issue. */
>>      if ( CONFIG_DOM0_MEM[0] && !dom0_mem_set )
>> @@ -337,53 +336,65 @@ unsigned long __init dom0_compute_nr_pag
>>          avail -= d->max_vcpus - 1;
>>  
>>      /* Reserve memory for iommu_dom0_init() (rough estimate). */
>> -    if ( is_iommu_enabled(d) )
>> +    if ( is_iommu_enabled(d) && !iommu_hwdom_passthrough )
>>      {
>>          unsigned int s;
>>  
>>          for ( s = 9; s < BITS_PER_LONG; s += 9 )
>> -            avail -= max_pdx >> s;
>> +            iommu_pages += max_pdx >> s;
>> +
>> +        avail -= iommu_pages;
>> +    }
>> +
>> +    nr_pages = get_memsize(&dom0_size, avail);
>> +
>> +    /*
>> +     * If allocation isn't specified, reserve 1/16th of available memory for
>> +     * things like DMA buffers. This reservation is clamped to a maximum of
>> +     * 128MB.
>> +     */
>> +    if ( !nr_pages )
>> +    {
>> +        nr_pages = avail - (pv_shim ? pv_shim_mem(avail)
>> +                            : min(avail / 16, 128UL << (20 - PAGE_SHIFT)));
>> +        if ( paging_mode_enabled(d) )
>> +            /*
>> +             * Temporary workaround message until internal (paging) memory
>> +             * accounting required to build a pvh dom0 is improved.
>> +             */
>> +            printk("WARNING: PVH dom0 without dom0_mem set is still 
>> unstable. "
>> +                   "If you get crashes during boot, try adding a dom0_mem 
>> parameter\n");
>>      }
>>  
>> -    need_paging = is_hvm_domain(d) &&
>> -        (!iommu_use_hap_pt(d) || !paging_mode_hap(d));
>> -    for ( ; ; need_paging = false )
>> +    if ( paging_mode_enabled(d) || opt_dom0_shadow )
> 
> Do we also need to account for opt_pv_l1tf_hwdom in case dom0 gets
> shadowing enabled during runtime?

Yes, we do, and I've added that to the check for v5 already.

> The rest LGTM, so:
> 
> Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Thanks, but as said in a reply to this just yesterday, this is buggy,
and a v5 is going to be needed anyway.

> I'm also fine if you want to remove the warning message at this time.

Okay, will do.

Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.