[PATCH 0/2] Fix the Xen HVM kdump/kexec boot panic issue
When kdump/kexec is enabled in an HVM VM, a kernel panic traps to Xen with reason=soft_reset. As a result, Xen reboots the VM into the kdump kernel.

Unfortunately, when the VM is panicked with the command line below ...

"taskset -c 33 echo c > /proc/sysrq-trigger"

... the kdump kernel panics at an early stage ...

PANIC: early exception 0x0e IP 10:ffffffffa8c66876 error 0 cr2 0x20
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.0-rc5xen #1
[ 0.000000] Hardware name: Xen HVM domU
[ 0.000000] RIP: 0010:pvclock_clocksource_read+0x6/0xb0
... ...
[ 0.000000] RSP: 0000:ffffffffaa203e20 EFLAGS: 00010082 ORIG_RAX: 0000000000000000
[ 0.000000] RAX: 0000000000000003 RBX: 0000000000010000 RCX: 00000000ffffdfff
[ 0.000000] RDX: 0000000000000003 RSI: 00000000ffffdfff RDI: 0000000000000020
[ 0.000000] RBP: 0000000000011000 R08: 0000000000000000 R09: 0000000000000001
[ 0.000000] R10: ffffffffaa203e00 R11: ffffffffaa203c70 R12: 0000000040000004
[ 0.000000] R13: ffffffffaa203e5c R14: ffffffffaa203e58 R15: 0000000000000000
[ 0.000000] FS: 0000000000000000(0000) GS:ffffffffaa95e000(0000) knlGS:0000000000000000
[ 0.000000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.000000] CR2: 0000000000000020 CR3: 00000000ec9e0000 CR4: 00000000000406a0
[ 0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.000000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.000000] Call Trace:
[ 0.000000] ? xen_init_time_common+0x11/0x55
[ 0.000000] ? xen_hvm_init_time_ops+0x23/0x45
[ 0.000000] ? xen_hvm_guest_init+0x214/0x251
[ 0.000000] ? 0xffffffffa8c00000
[ 0.000000] ? setup_arch+0x440/0xbd6
[ 0.000000] ? start_kernel+0x6a/0x689
[ 0.000000] ? secondary_startup_64_no_verify+0xc2/0xcb

This is because Xen HVM supports at most MAX_VIRT_CPUS=32 'vcpu_info' structures embedded inside 'shared_info' during the early boot stage, until xen_vcpu_setup() is used to allocate/relocate the 'vcpu_info' for the boot CPU at an arbitrary address.
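To illustrate how the 32-slot limit could lead to the fault address cr2 0x20 in the trace, here is a rough userspace sketch. It assumes (this is not stated in the text above) that the kdump kernel comes up on the vCPU that panicked, 33 in this case, and that early boot code looks up 'vcpu_info' by vCPU id inside 'shared_info'. The struct layouts are simplified x86-64 models of the Xen public interface; shared_info_model, early_vcpu_info() and time_info_addr() are hypothetical names used only for this illustration and are not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VIRT_CPUS 32 /* limit named in the text above */

/* Simplified x86-64 model of the per-vCPU time record (32 bytes). */
struct pvclock_vcpu_time_info {
    uint32_t version;
    uint32_t pad0;
    uint64_t tsc_timestamp;
    uint64_t system_time;
    uint32_t tsc_to_system_mul;
    int8_t   tsc_shift;
    uint8_t  flags;
    uint8_t  pad[2];
};

struct arch_vcpu_info {
    uint64_t cr2;
    uint64_t pad;
};

/* Simplified x86-64 model of vcpu_info (64 bytes). */
struct vcpu_info {
    uint8_t  evtchn_upcall_pending;
    uint8_t  evtchn_upcall_mask;
    uint64_t evtchn_pending_sel;
    struct arch_vcpu_info arch;
    struct pvclock_vcpu_time_info time;
};

/* Hypothetical stand-in for shared_info: only the embedded array is modelled. */
struct shared_info_model {
    struct vcpu_info vcpu_info[MAX_VIRT_CPUS];
};

/*
 * Hypothetical helper: an early-boot style lookup by vCPU id.
 * vCPUs with id >= MAX_VIRT_CPUS have no embedded slot, so the
 * result is NULL until a vcpu_info is registered elsewhere.
 */
struct vcpu_info *early_vcpu_info(struct shared_info_model *s,
                                  unsigned int vcpu_id)
{
    if (vcpu_id < MAX_VIRT_CPUS)
        return &s->vcpu_info[vcpu_id];
    return NULL;
}

/*
 * Hypothetical helper: the address a clock read would dereference,
 * i.e. the 'time' member of the given vcpu_info. Computed with
 * integer arithmetic so a NULL base is not dereferenced here.
 */
uintptr_t time_info_addr(struct vcpu_info *v)
{
    return (uintptr_t)v + offsetof(struct vcpu_info, time);
}
```

With this model, time_info_addr(early_vcpu_info(&s, 33)) evaluates to 0x20 on a typical platform where NULL is address 0, which matches the RDI and CR2 values in the trace. This is only a consistency check under the assumptions above, not a description of the actual kernel code path.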
The 1st patch fixes the issue on the VM kernel side. However, we may still observe clock drift in the VM due to an issue on the Xen hypervisor side: the PV vcpu_time_info is not updated on VCPUOP_register_vcpu_info.

The 2nd patch calls force_update_vcpu_system_time() on the Xen side on VCPUOP_register_vcpu_info, to avoid VM clock drift during kdump kernel boot.

I tested the fix by backporting the 2nd patch to an older Xen version, because I am not able to use soft_reset successfully with mainline Xen. I encountered the error below when testing soft_reset with mainline Xen. Please let me know if there is any known issue/solution.

# xl -v create -F vm.cfg
... ...
... ...
Domain 1 has shut down, reason code 5 0x5
Action for shutdown reason code 5 is soft-reset
Done. Rebooting now
xc: error: Failed to set d1's policy (err leaf 0xffffffff, subleaf 0xffffffff, msr 0xffffffff) (17 = File exists): Internal error
libxl: error: libxl_cpuid.c:488:libxl__cpuid_legacy: Domain 1:Failed to apply CPUID policy: File exists
libxl: error: libxl_create.c:1573:domcreate_rebuild_done: Domain 1:cannot (re-)build domain: -3
libxl: error: libxl_xshelp.c:201:libxl__xs_read_mandatory: xenstore read failed: `/libxl/1/type': No such file or directory
libxl: warning: libxl_dom.c:53:libxl__domain_type: unable to get domain type for domid=1, assuming HVM

Thank you very much!

Dongli Zhang