
Re: [PATCH] xen/balloon: add late_initcall_sync() for initial ballooning done



On Thu, Oct 28, 2021 at 12:59:52PM +0200, Juergen Gross wrote:
> When running as PVH or HVM guest with actual memory < max memory the
> hypervisor is using "populate on demand" in order to allow the guest
> to balloon down from its maximum memory size. For this to work
> correctly the guest must not touch more memory pages than its target
> memory size as otherwise the PoD cache will be exhausted and the guest
> is crashed as a result of that.
> 
> In extreme cases ballooning down might not be finished today before
> the init process is started, which can consume lots of memory.
> 
> In order to avoid random boot crashes in such cases, add a late init
> call to wait for ballooning down having finished for PVH/HVM guests.
> 
> Cc: <stable@xxxxxxxxxxxxxxx>
> Reported-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Juergen Gross <jgross@xxxxxxxx>

The initial balloon down may fail (state == BP_ECANCELED). In that
case, the new initcall waits indefinitely, because current_credit()
never reaches zero. I think it should rather report a failure (and
panic? It's similar to an OOM before PID 1 starts, so it's rather hard
to recover from) instead of hanging.
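
To illustrate the idea, here is a rough userspace model of a wait loop
that bails out when the balloon state becomes BP_ECANCELED instead of
polling forever. This is only a sketch, not kernel code: struct
balloon_model, balloon_step(), wait_for_balloon(), the step/fail_at
fields, and the max_polls limit are all made up for the example, and
the returned error codes are just one possible choice.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Userspace model only; names mirror the thread, not the real driver. */
enum bp_state { BP_DONE, BP_WAIT, BP_EAGAIN, BP_ECANCELED };

struct balloon_model {
	long credit;         /* pages still to be ballooned out */
	enum bp_state state; /* result of the last simulated step */
	long step;           /* pages released per poll (hypothetical) */
	long fail_at;        /* credit level at which the model fails, -1 = never */
};

/* One simulated iteration of the balloon worker. */
static void balloon_step(struct balloon_model *b)
{
	if (b->credit == 0) {
		b->state = BP_DONE;
		return;
	}
	if (b->fail_at >= 0 && b->credit <= b->fail_at) {
		/* Model a failed balloon down: credit stays where it is. */
		b->state = BP_ECANCELED;
		return;
	}
	b->credit -= (b->step < b->credit) ? b->step : b->credit;
	b->state = b->credit ? BP_EAGAIN : BP_DONE;
}

/*
 * Wait until the credit reaches zero, but stop on failure or after
 * max_polls iterations. Returns 0 on success, -ENOMEM if ballooning
 * was cancelled, -ETIMEDOUT if the poll limit was hit.
 */
static int wait_for_balloon(struct balloon_model *b, int max_polls)
{
	assert(b != NULL);

	for (int i = 0; i < max_polls && b->credit; i++) {
		balloon_step(b);
		if (b->state == BP_ECANCELED) {
			fprintf(stderr, "initial ballooning failed, %ld pages left\n",
				b->credit);
			return -ENOMEM;
		}
	}
	return b->credit ? -ETIMEDOUT : 0;
}
```

In the real initcall, the equivalent of the -ENOMEM branch would be the
place to print an error (or panic), rather than continuing to sleep in
the loop.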

Anyway, it does fix the boot crashes.

> ---
>  drivers/xen/balloon.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 3a50f097ed3e..d19b851c3d3b 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -765,3 +765,23 @@ static int __init balloon_init(void)
>       return 0;
>  }
>  subsys_initcall(balloon_init);
> +
> +static int __init balloon_wait_finish(void)
> +{
> +     if (!xen_domain())
> +             return -ENODEV;
> +
> +     /* PV guests don't need to wait. */
> +     if (xen_pv_domain() || !current_credit())
> +             return 0;
> +
> +     pr_info("Waiting for initial ballooning down having finished.\n");
> +
> +     while (current_credit())
> +             schedule_timeout_interruptible(HZ / 10);
> +
> +     pr_info("Initial ballooning down finished.\n");
> +
> +     return 0;
> +}
> +late_initcall_sync(balloon_wait_finish);
> -- 
> 2.26.2
> 

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
