
Re: [PATCH v4] xen/balloon: add late_initcall_sync() for initial ballooning done


  • To: Juergen Gross <jgross@xxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, linux-doc@xxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
  • From: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
  • Date: Thu, 4 Nov 2021 11:55:34 -0400
  • Cc: Jonathan Corbet <corbet@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, stable@xxxxxxxxxxxxxxx, Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 04 Nov 2021 15:57:06 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>


On 11/3/21 9:55 PM, Boris Ostrovsky wrote:

On 11/2/21 5:19 AM, Juergen Gross wrote:
When running as PVH or HVM guest with actual memory < max memory the
hypervisor is using "populate on demand" in order to allow the guest
to balloon down from its maximum memory size. For this to work
correctly the guest must not touch more memory pages than its target
memory size as otherwise the PoD cache will be exhausted and the guest
is crashed as a result of that.

In extreme cases ballooning down might not be finished today before
the init process is started, which can consume lots of memory.

In order to avoid random boot crashes in such cases, add a late init
call to wait for ballooning down having finished for PVH/HVM guests.

Warn on console if initial ballooning fails, panic() after stalling
for more than 3 minutes per default. Add a module parameter for
changing this timeout.

Cc: <stable@xxxxxxxxxxxxxxx>
Reported-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
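
For readers skimming the description above, the "wait, but give up after a stall timeout" behaviour can be sketched roughly as follows. This is a userspace simulation written for illustration only, not the code from the patch: the struct, the function names, the one-tick-per-second polling and the return codes are all assumptions, and the real kernel code would panic() rather than return a status.

```c
#include <stdbool.h>

/* Hypothetical result codes; the actual patch panics on a long stall. */
enum wait_result { WAIT_DONE, WAIT_STALLED };

/* Hypothetical stand-in for the balloon driver's state. */
struct sim_balloon {
    long excess_pages;   /* pages still above the target size */
    long pages_per_sec;  /* pages released per simulated second */
    long stuck_at;       /* level at which progress stops (0 = never stuck) */
};

/* Simulate one second of ballooning down. */
static void sim_tick(struct sim_balloon *b)
{
    long room = b->excess_pages - b->stuck_at;

    if (room <= 0)
        return;          /* stalled: nothing released this second */
    b->excess_pages -= (room < b->pages_per_sec) ? room : b->pages_per_sec;
}

/*
 * Wait until the balloon reaches its target. The stall counter is reset
 * whenever progress is made, so only a continuous stall of more than
 * timeout_secs seconds ends the wait with WAIT_STALLED.
 */
enum wait_result wait_initial_balloon(struct sim_balloon *b,
                                      unsigned timeout_secs,
                                      unsigned *elapsed)
{
    unsigned stalled = 0;

    *elapsed = 0;
    while (b->excess_pages > 0) {
        long before = b->excess_pages;

        sim_tick(b);
        (*elapsed)++;

        if (b->excess_pages < before)
            stalled = 0;                 /* progress: restart stall timer */
        else if (++stalled > timeout_secs)
            return WAIT_STALLED;         /* kernel analogue: panic() */
    }
    return WAIT_DONE;
}
```

In this sketch a slow but steadily progressing balloon never trips the timeout, which is the point of measuring the stall rather than the total wait.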



Reviewed-by: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>


This appears to have a noticeable effect on boot time (and on the boot
experience in general).


I have


  memory=1024
  maxmem=8192


My boot time (on an admittedly slow box) went from 33 to 45 seconds, and
boot pauses in the middle while it waits for ballooning to complete.


[    5.062714] xen:balloon: Waiting for initial ballooning down having finished.
[    5.449696] random: crng init done
[   34.613050] xen:balloon: Initial ballooning down finished.


So I think we should at least consider bumping the log level of these messages down from info.



-boris




 

