[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3] xen/balloon: add late_initcall_sync() for initial ballooning done


  • To: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
  • From: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
  • Date: Fri, 29 Oct 2021 19:44:31 -0400
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=gfoXfXhQS40ektirZEXsfnYNyX2cBA+sq9UE/J+mvvU=; b=eSgQB0hJtZKCEOGGcskFRhOtmtQwJ54feV1JktURfBsqndsyOetDlrjtTpC8+Nu/yq5ebz4eleS226khvxGBcDoi/gPXHKs5pOd0oIkSEPDz/Sjlf6DV9P3D2k2/lemChYSk2djhV5/Ur3R3ZCjKEbrKtdxhMcTnijqXnjJnWSG/TRnFlSNrH1JW4M36pa7j7Sj3a5UubRE3NQ1zVGIbJeZMVvQAKKoK+6Oox/+71XH0gKSk4ptCrnhJkT2Mv650WEGRTd/lWnoXnhyiHxkiPhWw/Te1QXj12SFSOsxPbcr/huyXNEicpXdMTBeM3tqzQDAJG51UHZjYSWps4LM5TA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Vo77w0zdKOd+qf/9/b2ecr7mlgNy03KUlUwZmHY/JvoPtPOgBh4B2ZcaBmB8GKmZ/IFGfhpQnTPOcQgIvx4yjde13mGDU4sEzDEJ2Wg8AQLUaNmSOEionDx4AgHcLsS2qpPu7Onl+DGOnFFQkkE63PV2Rrj7/8KIzZoz54FNU00qVij7TvqIWmWmaVmw5o5RXJ5sg1H29zcbuPrTUteQcZUdo6rjqT1VD7Fm0vDnloAwYwvamuncOROkZ3CYX6fPB0SWAnsjAOgjQ4w/Q35QDXqiF1f1jylKzv4MeTEbxdJTexxbRFxDqh3AsdkkhfBDib1lUqxxkILduvgvClJBZA==
  • Authentication-results: invisiblethingslab.com; dkim=none (message not signed) header.d=none;invisiblethingslab.com; dmarc=none action=none header.from=oracle.com;
  • Cc: Juergen Gross <jgross@xxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, Stefano Stabellini <sstabellini@xxxxxxxxxx>, stable@xxxxxxxxxxxxxxx
  • Delivery-date: Fri, 29 Oct 2021 23:45:20 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>


On 10/29/21 6:18 PM, Marek Marczykowski-Górecki wrote:
On Fri, Oct 29, 2021 at 05:46:18PM -0400, Boris Ostrovsky wrote:
On 10/29/21 10:20 AM, Juergen Gross wrote:
--- a/Documentation/ABI/stable/sysfs-devices-system-xen_memory
+++ b/Documentation/ABI/stable/sysfs-devices-system-xen_memory
@@ -84,3 +84,13 @@ Description:
                Control scrubbing pages before returning them to Xen for others 
domains
                use. Can be set with xen_scrub_pages cmdline
                parameter. Default value controlled with 
CONFIG_XEN_SCRUB_PAGES_DEFAULT.
+
+What:          /sys/devices/system/xen_memory/xen_memory0/boot_timeout
+Date:          November 2021
+KernelVersion: 5.16
+Contact:       xen-devel@xxxxxxxxxxxxxxxxxxxx
+Description:
+               The time (in seconds) to wait before giving up to boot in case
+               initial ballooning fails to free enough memory. Applies only
+               when running as HVM or PVH guest and started with less memory
+               configured than allowed at max.

How is this going to be used? We only need this during boot.


-               state = update_schedule(state);
+               balloon_state = update_schedule(balloon_state);

Now that balloon_state has whole file scope it can probably be updated inside 
update_schedule().


+       while ((credit = current_credit()) < 0) {
+               if (credit != last_credit) {
+                       last_changed = jiffies;
+                       last_credit = credit;
+               }
+               if (balloon_state == BP_ECANCELED) {

What about other states? We are really waiting for BP_DONE, aren't we?
BP_DONE is set also as an intermediate step:

                        balloon_state = decrease_reservation(n_pages,
                                                             GFP_BALLOON);
                        if (balloon_state == BP_DONE && n_pages != -credit &&
                             n_pages < totalreserve_pages)
                                balloon_state = BP_EAGAIN;

It would be bad to finish waiting in this case.


RIght, but if we were to say 'if (balloon_state != BP_DONE)' the worst that can 
happen is that we will continue on to the next iteration without warning and/or 
panicing. Of course, there is a chance thaton the next iteration the same thing 
will happen but I think chances of hitting this race every time are infinitely 
low. We can also check for current_credit() again.


The question is whether we do want to continue waiting if we are in BP_AGAIN. I 
don't think BP_WAIT is possible in this case although this may change in the 
future and we will forget to update this code.


-boris




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.