[Xen-devel] [PATCH] [RFC] Fix a small window on CPU online/offline
This is an RFC patch for a small window in CPU online/offline. It is not a clean solution; I am sending it out before I have finished checking all the related code, so that I can get feedback from the community.

Currently there is a small window in CPU online/offline. During take_cpu_down(), in stop_machine_run() context, the CPU is marked offline and IRQs are disabled. But it is only in play_dead(), in idle_task_exit() called from cpu_exit_clear(), that the offlining CPU tries to sync the lazy exec state. The window is that by the time play_dead() runs, stop_machine_run() has already finished, and the vcpu whose context is out of sync may be scheduled on another CPU.

This may cause several issues:

a) When the vcpu is scheduled on another CPU, it will try to sync the context on the original CPU through flush_tlb_mask(), as in the following code in context_switch(). Because the original CPU is marked offline and has IRQs disabled, it will hang in flush_area_mask(). I sent patch 21079:8ab60a883fd5 to avoid the hang.

    if ( unlikely(!cpu_isset(cpu, dirty_mask) && !cpus_empty(dirty_mask)) )
    {
        /* Other cpus call __sync_lazy_execstate from flush ipi handler. */
        flush_tlb_mask(&dirty_mask);
    }

b) However, changeset 21079 is still not the right solution, although the patch itself is OK. With this changeset the system will not hang, but the vCPU's context is not synced.

c) Moreover, when the offlining CPU executes idle_task_exit(), it may try to re-sync the vcpu context with the guest, which will clobber the running vCPU.

The following code tries to sync the vcpu context in stop_machine_run() context, so that the vCPU gets its context synced. However, it still does not resolve issue c. I'm considering also marking curr_vcpu as idle, so that idle_task_exit() will not try to sync the context again, but I suspect that is not the right way. Any suggestions?

BTW, the flush_local() is to make sure we flush all TLB context, so that when the CPU comes online again there is no garbage on the CPU, especially if the CPU has no deep C-state.
--jyh

diff -r ebd84be3420a xen/arch/x86/smpboot.c
--- a/xen/arch/x86/smpboot.c  Tue Mar 30 18:31:39 2010 +0100
+++ b/xen/arch/x86/smpboot.c  Thu Apr 01 16:47:57 2010 +0800
@@ -34,6 +34,7 @@
  * Rusty Russell : Hacked into shape for new "hotplug" boot process.
  */
 #include <xen/config.h>
+#include <asm/i387.h>
 #include <xen/init.h>
 #include <xen/kernel.h>
 #include <xen/mm.h>
@@ -1308,6 +1309,22 @@ int __cpu_disable(void)

     cpu_disable_scheduler();

+
+    if ( !is_idle_vcpu(this_cpu(curr_vcpu)) )
+    {
+        struct cpu_user_regs *stack_regs = guest_cpu_user_regs();
+        struct vcpu *v;
+
+        v = this_cpu(curr_vcpu);
+        memcpy(&v->arch.guest_context.user_regs,
+               stack_regs,
+               CTXT_SWITCH_STACK_BYTES);
+        unlazy_fpu(v);
+        current->arch.ctxt_switch_from(v);
+    }
+
+    flush_local(FLUSH_CACHE | FLUSH_TLB_GLOBAL |FLUSH_TLB);
+
     return 0;
 }

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel