[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [PATCH v2 1/1] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32
The sched_clock() can be used very early since commit 857baa87b642 ("sched/clock: Enable sched clock early"). In addition, with commit 38669ba205d1 ("x86/xen/time: Output xen sched_clock time from 0"), kdump kernel in Xen HVM guest may panic at very early stage when accessing &__this_cpu_read(xen_vcpu)->time as in below: setup_arch() -> init_hypervisor_platform() -> x86_init.hyper.init_platform = xen_hvm_guest_init() -> xen_hvm_init_time_ops() -> xen_clocksource_read() -> src = &__this_cpu_read(xen_vcpu)->time; This is because Xen HVM supports at most MAX_VIRT_CPUS=32 'vcpu_info' embedded inside 'shared_info' during early stage until xen_vcpu_setup() is used to allocate/relocate 'vcpu_info' for boot cpu at arbitrary address. However, when Xen HVM guest panic on vcpu >= 32, since xen_vcpu_info_reset(0) would set per_cpu(xen_vcpu, cpu) = NULL when vcpu >= 32, xen_clocksource_read() on vcpu >= 32 would panic. This patch delays xen_hvm_init_time_ops() to later in xen_hvm_smp_prepare_boot_cpu() after the 'vcpu_info' for boot vcpu is registered when the boot vcpu is >= 32. Another option is to always delay xen_hvm_init_time_ops() for any vcpus (including vcpu=0). Since to delay xen_hvm_init_time_ops() may lead to clock backward issue, it is preferred to avoid that for regular boot (The pv_sched_clock=native_sched_clock() is used at the very beginning until xen_sched_clock() is registered). That requires to adjust xen_sched_clock_offset. That's why we only delay xen_hvm_init_time_ops() for vcpu>=32. This issue can be reproduced on purpose via below command at the guest side when kdump/kexec is enabled: "taskset -c 33 echo c > /proc/sysrq-trigger" Reference: https://lists.xenproject.org/archives/html/xen-devel/2021-10/msg00571.html Cc: Joe Jin <joe.jin@xxxxxxxxxx> Signed-off-by: Dongli Zhang <dongli.zhang@xxxxxxxxxx> --- Changed since v1: - Add commit message to explain why xen_hvm_init_time_ops() is delayed for any vcpus. (Suggested by Boris Ostrovsky) - Add a comment in xen_hvm_smp_prepare_boot_cpu() referencing the related code in xen_hvm_guest_init(). (suggested by Juergen Gross) arch/x86/xen/enlighten_hvm.c | 20 +++++++++++++++++++- arch/x86/xen/smp_hvm.c | 8 ++++++++ 2 files changed, 27 insertions(+), 1 deletion(-) diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c index e68ea5f4ad1c..7734dec52794 100644 --- a/arch/x86/xen/enlighten_hvm.c +++ b/arch/x86/xen/enlighten_hvm.c @@ -216,7 +216,25 @@ static void __init xen_hvm_guest_init(void) WARN_ON(xen_cpuhp_setup(xen_cpu_up_prepare_hvm, xen_cpu_dead_hvm)); xen_unplug_emulated_devices(); x86_init.irqs.intr_init = xen_init_IRQ; - xen_hvm_init_time_ops(); + + /* + * Only MAX_VIRT_CPUS 'vcpu_info' are embedded inside 'shared_info' + * and the VM would use them until xen_vcpu_setup() is used to + * allocate/relocate them at arbitrary address. + * + * However, when Xen HVM guest panic on vcpu >= MAX_VIRT_CPUS, + * per_cpu(xen_vcpu, cpu) is still NULL at this stage. To access + * per_cpu(xen_vcpu, cpu) via xen_clocksource_read() would panic. + * + * Therefore we delay xen_hvm_init_time_ops() to + * xen_hvm_smp_prepare_boot_cpu() when boot vcpu is >= MAX_VIRT_CPUS. + */ + if (xen_vcpu_nr(0) >= MAX_VIRT_CPUS) + pr_info("Delay xen_hvm_init_time_ops() as kernel is running on vcpu=%d\n", + xen_vcpu_nr(0)); + else + xen_hvm_init_time_ops(); + xen_hvm_init_mmu_ops(); #ifdef CONFIG_KEXEC_CORE diff --git a/arch/x86/xen/smp_hvm.c b/arch/x86/xen/smp_hvm.c index 6ff3c887e0b9..f99043df8bb5 100644 --- a/arch/x86/xen/smp_hvm.c +++ b/arch/x86/xen/smp_hvm.c @@ -19,6 +19,14 @@ static void __init xen_hvm_smp_prepare_boot_cpu(void) */ xen_vcpu_setup(0); + /* + * The xen_hvm_init_time_ops() is delayed from + * xen_hvm_guest_init() to here to avoid panic when the kernel + * boots from vcpu>=MAX_VIRT_CPUS (32). + */ + if (xen_vcpu_nr(0) >= MAX_VIRT_CPUS) + xen_hvm_init_time_ops(); + /* * The alternative logic (which patches the unlock/lock) runs before * the smp bootup up code is activated. Hence we need to set this up -- 2.17.1
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |