
Re: [PATCH v3 00/11] Fix PM hibernation in Xen guests



On Fri, Aug 21, 2020 at 10:22:43PM +0000, Anchal Agarwal wrote:
> Hello,
> This series fixes PM hibernation for HVM guests running on the Xen hypervisor.
> A running guest can now be hibernated and resumed successfully at a
> later time. The fixes are added to the block and network device drivers,
> i.e. xen-blkfront and xen-netfront. Any other driver that needs S4 support
> and does not already have it can follow the same method of
> introducing freeze/thaw/restore callbacks.
> The patches have been tested against the upstream kernel and Xen 4.11.
> Large-scale testing has also been done on Xen-based Amazon EC2 instances.
> All of this testing involved running a memory-exhausting workload in the
> background.
>   
> Guest hibernation does not require any support from the hypervisor, so
> the guest has complete control over its state. Infrastructure
> restrictions on saving guest state can be overcome by guest-initiated
> hibernation.
>   
> These patches were sent out as an RFC before, and all the feedback has been
> incorporated in the patches. The previous v1 and v2 can be found here:
>   
> [v1]: https://lkml.org/lkml/2020/5/19/1312
> [v2]: https://lkml.org/lkml/2020/7/2/995
> All comments and feedback from v2 have been incorporated in the v3 series.
> 
> Known issues:
> 1. KASLR causes intermittent hibernation failures. The VM fails to resume and
> has to be restarted. I will investigate this issue separately; it shouldn't
> be a blocker for this patch series.
> 2. During hibernation, I sometimes observed that freezing of tasks fails due
> to a busy XFS workqueue [xfs-cil/xfs-sync]. This is also intermittent, maybe 1
> out of 200 runs, and hibernation is aborted in this case. Retrying hibernation
> may work. This is a known issue with hibernation and some filesystems
> like XFS; it has been discussed by the community for years without an
> effective resolution at this point.
> 
> Testing how-to:
> ---------------
> 1. Set up the Xen hypervisor on a physical machine [I used Ubuntu 16.04 +
> upstream xen-4.11].
> 2. Bring up an HVM guest with a kernel compiled with the hibernation patches
> [I used Ubuntu 18.04 netboot bionic images and also Amazon Linux on-prem
> images].
> 3. Create a swap file with size >= RAM size.
> 4. Update the grub parameters and reboot.
> 5. Trigger pm-hibernation from within the VM.
> 
> Example:
> Set up a file-backed swap space. The swap file size must be >= the total
> memory on the system:
> sudo dd if=/dev/zero of=/swap bs=$(( 1024 * 1024 )) count=4096 # 4096MiB
> sudo chmod 600 /swap
> sudo mkswap /swap
> sudo swapon /swap
> 
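The size requirement above can be checked before attempting hibernation. Below is a minimal sketch, not part of the series; the helper names (`mem_total_bytes`, `swap_is_large_enough`, `check_swap_file`) are made up for illustration, and it assumes the usual `/proc/meminfo` layout in which `MemTotal` is reported in kB.

```python
import os
import re


def mem_total_bytes(meminfo_text):
    """Return MemTotal, in bytes, parsed from /proc/meminfo-style text."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    # /proc/meminfo reports the value in units of 1024 bytes.
    return int(match.group(1)) * 1024


def swap_is_large_enough(swap_bytes, meminfo_text):
    """True if the swap file is at least as large as total memory."""
    return swap_bytes >= mem_total_bytes(meminfo_text)


def check_swap_file(path, meminfo_path="/proc/meminfo"):
    """Compare the on-disk size of `path` with MemTotal of this system."""
    with open(meminfo_path) as f:
        return swap_is_large_enough(os.path.getsize(path), f.read())
```

For the example above, `check_swap_file("/swap")` run inside the guest would report whether the 4096MiB file covers the guest's memory.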
> Update resume device/resume offset in grub if using swap file:
> resume=/dev/xvda1 resume_offset=200704 no_console_suspend=1
> 
> Execute:
> --------
> sudo pm-hibernate
> OR
> echo reboot > /sys/power/disk && echo disk > /sys/power/state
> 
> Compute resume offset code:
> "
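> #!/usr/bin/env python3
> import sys
> import array
> import fcntl
> 
> FIBMAP = 0x01  # ioctl: map a logical file block to its physical block
> 
> # The swap file path is the first argument; FIBMAP requires root.
> with open(sys.argv[1], 'rb') as f:
>     # FIBMAP takes a C int: logical block 0 in, physical block number out.
>     buf = array.array('i', [0])
>     fcntl.ioctl(f.fileno(), FIBMAP, buf)
> 
> print(buf[0])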
> "
> 
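One detail worth spelling out: as far as I understand, FIBMAP reports the block number in filesystem-block units, while resume_offset is expected in page-size units, so the two only coincide when the filesystem block size equals the page size. A small sketch of the conversion follows; the function names are hypothetical and the block and page sizes are passed in explicitly as assumptions.

```python
def resume_offset_pages(fibmap_block, fs_block_size, page_size):
    """Convert a FIBMAP block number into an offset in page-size units."""
    byte_offset = fibmap_block * fs_block_size
    if byte_offset % page_size:
        # The swap header should start on a page boundary.
        raise ValueError("swap header is not page-aligned")
    return byte_offset // page_size


def resume_cmdline(device, offset):
    """Format the kernel parameters used in the grub example."""
    return "resume={} resume_offset={}".format(device, offset)
```

With a 4096-byte filesystem block and a 4096-byte page, the FIBMAP value can be used as-is. The page size can be read with `os.sysconf("SC_PAGE_SIZE")`.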
> Aleksei Besogonov (1):
>   PM / hibernate: update the resume offset on SNAPSHOT_SET_SWAP_AREA
> 
> Anchal Agarwal (4):
>   x86/xen: Introduce new function to map HYPERVISOR_shared_info on
>     Resume
>   x86/xen: save and restore steal clock during PM hibernation
>   xen: Introduce wrapper for save/restore sched clock offset
>   xen: Update sched clock offset to avoid system instability in
>     hibernation
> 
> Munehisa Kamata (5):
>   xen/manage: keep track of the on-going suspend mode
>   xenbus: add freeze/thaw/restore callbacks support
>   x86/xen: add system core suspend and resume callbacks
>   xen-blkfront: add callbacks for PM suspend and hibernation
>   xen-netfront: add callbacks for PM suspend and hibernation
> 
> Thomas Gleixner (1):
>   genirq: Shutdown irq chips in suspend/resume during hibernation
> 
>  arch/x86/xen/enlighten_hvm.c      |   7 +++
>  arch/x86/xen/suspend.c            |  63 ++++++++++++++++++++
>  arch/x86/xen/time.c               |  15 ++++-
>  arch/x86/xen/xen-ops.h            |   3 +
>  drivers/block/xen-blkfront.c      | 122 ++++++++++++++++++++++++++++++++++++--
>  drivers/net/xen-netfront.c        |  96 +++++++++++++++++++++++++++++-
>  drivers/xen/events/events_base.c  |   1 +
>  drivers/xen/manage.c              |  46 ++++++++++++++
>  drivers/xen/xenbus/xenbus_probe.c |  96 +++++++++++++++++++++++++-----
>  include/linux/irq.h               |   2 +
>  include/xen/xen-ops.h             |   3 +
>  include/xen/xenbus.h              |   3 +
>  kernel/irq/chip.c                 |   2 +-
>  kernel/irq/internals.h            |   1 +
>  kernel/irq/pm.c                   |  31 +++++++---
>  kernel/power/user.c               |   7 ++-
>  16 files changed, 464 insertions(+), 34 deletions(-)
> 
> -- 
> 2.16.6
>
A gentle ping on the series: is there any more feedback, or can we plan to
merge this? I can then resend the series with the minor fixes pointed out by
tglx@.

Thanks,
Anchal



 

