
Re: [Xen-devel] [PATCH 1/2] VMX: fix VMCS race on context-switch paths



On Wed, 2017-02-15 at 04:00 -0700, Jan Beulich wrote:
> > > > On 15.02.17 at 11:27, <sergey.dyasli@xxxxxxxxxx> wrote:
> > 
> > This is what I'm getting during the original test case (32 VMs reboot):
> > 
> > (XEN) [ 1407.789329] Watchdog timer detects that CPU12 is stuck!
> > (XEN) [ 1407.795726] ----[ Xen-4.6.1-xs-local  x86_64  debug=n  Not tainted ]----
> > (XEN) [ 1407.803774] CPU:    12
> > (XEN) [ 1407.806975] RIP:    e008:[<ffff82d0801ea2a2>] vmx_vmcs_reload+0x32/0x50
> > (XEN) [ 1407.814926] RFLAGS: 0000000000000013   CONTEXT: hypervisor (d230v0)
> > (XEN) [ 1407.822486] rax: 0000000000000000   rbx: ffff830079ee7000   rcx: 0000000000000000
> > (XEN) [ 1407.831407] rdx: 0000006f8f72ce00   rsi: ffff8329b3efbfe8   rdi: ffff830079ee7000
> > (XEN) [ 1407.840326] rbp: ffff83007bab7000   rsp: ffff83400fab7dc8   r8:  000001468e9e3ccc
> > (XEN) [ 1407.849246] r9:  ffff83403ffe7000   r10: 00000146c91c1737   r11: ffff833a9558c310
> > (XEN) [ 1407.858166] r12: ffff833a9558c000   r13: 000000000000000c   r14: ffff83403ffe7000
> > (XEN) [ 1407.867085] r15: ffff82d080364640   cr0: 0000000080050033   cr4: 00000000003526e0
> > (XEN) [ 1407.876005] cr3: 000000294b074000   cr2: 000007fefd7ce150
> > (XEN) [ 1407.882599] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> > (XEN) [ 1407.890938] Xen code around <ffff82d0801ea2a2> (vmx_vmcs_reload+0x32/0x50):
> > (XEN) [ 1407.899277]  84 00 00 00 00 00 f3 90 <83> bf e8 05 00 00 ff 75 f5 e9 a0 fa ff ff f3 c3
> > (XEN) [ 1407.908679] Xen stack trace from rsp=ffff83400fab7dc8:
> > (XEN) [ 1407.914982]    ffff82d08016c58d 0000000000001000 0000000000000000 0000000000000000
> > (XEN) [ 1407.923998]    0000000000000206 0000000000000086 0000000000000286 000000000000000c
> > (XEN) [ 1407.933017]    ffff83007bab7058 ffff82d080364640 ffff83007bab7000 00000146a7f26495
> > (XEN) [ 1407.942032]    ffff830079ee7000 ffff833a9558cf84 ffff833a9558c000 ffff82d080364640
> > (XEN) [ 1407.951048]    ffff82d08012fb8e ffff83400fabda98 ffff83400faba148 ffff83403ffe7000
> > (XEN) [ 1407.960067]    ffff83400faba160 ffff83400fabda40 ffff82d080164305 000000000000000c
> > (XEN) [ 1407.969083]    ffff830079ee7000 0000000001c9c380 ffff82d080136400 000000440000011d
> > (XEN) [ 1407.978101]    00000000ffffffff ffffffffffffffff ffff83400fab0000 ffff82d080348d00
> > (XEN) [ 1407.987116]    ffff833a9558c000 ffff82d080364640 ffff82d08013311c ffff830079ee7000
> > (XEN) [ 1407.996134]    ffff83400fab0000 ffff830079ee7000 ffff83403ffe7000 00000000ffffffff
> > (XEN) [ 1408.005151]    ffff82d080167d35 ffff83007bab7000 0000000000000001 fffffa80077f9700
> > (XEN) [ 1408.014167]    fffffa80075bf900 fffffa80077f9820 0000000000000000 0000000000000000
> > (XEN) [ 1408.023184]    fffffa8008889c00 0000000002fa1e78 0000003b6ed18d78 0000000000000000
> > (XEN) [ 1408.032202]    00000000068e7780 fffffa80075ba790 fffffa80077f9848 fffff800027f9e80
> > (XEN) [ 1408.041220]    0000000000000001 000000fc00000000 fffff880042499c2 0000000000000000
> > (XEN) [ 1408.050235]    0000000000000246 fffff80000b9cb58 0000000000000000 80248e00e008e1f0
> > (XEN) [ 1408.059253]    00000000ffff82d0 80248e00e008e200 00000000ffff82d0 80248e000000000c
> > (XEN) [ 1408.068268]    ffff830079ee7000 0000006f8f72ce00 00000000ffff82d0
> > (XEN) [ 1408.075638] Xen call trace:
> > (XEN) [ 1408.079322]    [<ffff82d0801ea2a2>] vmx_vmcs_reload+0x32/0x50
> > (XEN) [ 1408.086303]    [<ffff82d08016c58d>] context_switch+0x85d/0xeb0
> > (XEN) [ 1408.093380]    [<ffff82d08012fb8e>] schedule.c#schedule+0x46e/0x7d0
> > (XEN) [ 1408.100942]    [<ffff82d080164305>] reprogram_timer+0x75/0xe0
> > (XEN) [ 1408.107925]    [<ffff82d080136400>] timer.c#timer_softirq_action+0x90/0x210
> > (XEN) [ 1408.116263]    [<ffff82d08013311c>] softirq.c#__do_softirq+0x5c/0x90
> > (XEN) [ 1408.123921]    [<ffff82d080167d35>] domain.c#idle_loop+0x25/0x60
> > 
> > Currently I'm testing the following patch, which fixes at least one possible scenario:
> > 
> > commit f76dc832c2de4950872fc27962c56c609cff1160
> > Author: Sergey Dyasli <sergey.dyasli@xxxxxxxxxx>
> > Date:   Tue Feb 14 15:23:54 2017 +0000
> > 
> >     x86/vmx: use curr_vcpu in vmx_vmcs_exit()
> >     
> >     During context_switch() from an HVM vCPU to the idle vCPU, current is
> >     updated but curr_vcpu retains the HVM vCPU's value.  If, after that,
> >     a vmx_vmcs_enter() + vmx_vmcs_exit() pair is executed for some
> >     reason, the test for has_hvm_container_vcpu(current) will fail and
> >     the VMCS for curr_vcpu will not be loaded.
> 
> I'm not seeing the connection to the watchdog invocation above:
> I take it that vmx_vmcs_reload() was caught while waiting for
> active_cpu to become -1. Yet that's what vmx_clear_vmcs()
> achieves, not vmx_load_vmcs().

Sorry for the misunderstanding, but these two patches are being tested
separately, not together. My patch is an alternative approach to the
issue. It fixes at least one observed BUG() scenario with PML, and I'm
currently looking for others.
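To spell out the failure mode my patch targets, here is a minimal,
self-contained toy model (every name in it is made up; this is not Xen
code, just an illustration of testing current vs. curr_vcpu):

```c
#include <assert.h>

/* Toy stand-ins for Xen's vcpu bookkeeping -- all names are hypothetical.
 * toy_current models "current" (updated early during context_switch()),
 * toy_curr_vcpu models "curr_vcpu" (still pointing at the switched-out
 * HVM vCPU until __context_switch() actually runs). */
struct toy_vcpu { int is_hvm; int vmcs_loaded; };

static struct toy_vcpu *toy_current;
static struct toy_vcpu *toy_curr_vcpu;

/* Model of the tail of vmx_vmcs_exit(): reload the VMCS of the vCPU
 * passed in, provided it is an HVM one.  Passing toy_current models the
 * existing test; passing toy_curr_vcpu models the fixed test. */
static void toy_vmcs_exit_tail(struct toy_vcpu *p)
{
    if (p->is_hvm)
        p->vmcs_loaded = 1;   /* stands in for vmx_load_vmcs(p) */
}
```

With toy_current pointing at the idle vCPU, the reload is skipped even
though the HVM vCPU's state is still live on the CPU; that is exactly the
window my patch closes by testing curr_vcpu instead.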

> Similarly, when __context_switch() is being bypassed also on the second
> (switch-in) path, VMCS ownership may have been lost and hence needs
> re-establishing. Since there's no existing hook to put this in, add a
> new one.

BTW, what about vmx_do_resume()? It calls vmx_load_vmcs() in cases where
the VMCS was cleared by someone else.
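What I mean, as a simplified self-contained sketch (made-up names;
modelling only the branch structure of vmx_do_resume(), not the real
code, and based on my reading of vmcs.c):

```c
#include <assert.h>

/* Hypothetical model of the choice vmx_do_resume() makes when a vCPU is
 * scheduled back in: if its VMCS is still active on this CPU it at most
 * needs (re)loading; otherwise it must be cleared where it was active
 * and then loaded here. */
enum toy_resume_action { TOY_RELOAD_ONLY, TOY_CLEAR_AND_LOAD };

static enum toy_resume_action toy_do_resume(int active_cpu, int this_cpu)
{
    if (active_cpu == this_cpu)
        return TOY_RELOAD_ONLY;    /* vmx_load_vmcs() if not the current VMCS */
    return TOY_CLEAR_AND_LOAD;     /* vmx_clear_vmcs() + vmx_load_vmcs() */
}
```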

> 
> >     This will lead to a BUG() during the next __context_switch(), when
> >     vmx_ctxt_switch_from() is executed and __vmwrite() fails.
> 
> But that case is specifically (supposed to be) taken care of by
> calling vmx_vmcs_reload() from vmx_ctxt_switch_from().
> 
> >     Fix this by testing has_hvm_container_vcpu() against curr_vcpu.
> 
> If at all possible, I'd like to avoid having to use curr_vcpu in the fix
> for this (whole) issue.
> 
> > --- a/xen/arch/x86/hvm/vmx/vmcs.c
> > +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> > @@ -742,6 +742,8 @@ void vmx_vmcs_enter(struct vcpu *v)
> >  void vmx_vmcs_exit(struct vcpu *v)
> >  {
> >      struct foreign_vmcs *fv;
> > +    unsigned int cpu = smp_processor_id();
> > +    struct vcpu *p = per_cpu(curr_vcpu, cpu);
> >  
> >      if ( likely(v == current) )
> >          return;
> > @@ -754,8 +756,8 @@ void vmx_vmcs_exit(struct vcpu *v)
> >      {
> >          /* Don't confuse vmx_do_resume (for @v or @current!) */
> >          vmx_clear_vmcs(v);
> > -        if ( has_hvm_container_vcpu(current) )
> > -            vmx_load_vmcs(current);
> > +        if ( has_hvm_container_vcpu(p) )
> > +            vmx_load_vmcs(p);
> 
> If this change (or something similar) really turned out to be
> necessary, the variable declarations should be moved into this
> more narrow scope (and I'd prefer the "cpu" one to be dropped
> altogether).
> 
> Jan
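Just to make sure I understand the suggestion, the hunk would then look
roughly like this (illustrative sketch only, not compile-tested, and
assuming this_cpu() is the appropriate per-CPU accessor here):

```
    /* inside vmx_vmcs_exit(), with the scope narrowed as suggested: */
    {
        struct vcpu *p = this_cpu(curr_vcpu);

        /* Don't confuse vmx_do_resume (for @v or @current!) */
        vmx_clear_vmcs(v);
        if ( has_hvm_container_vcpu(p) )
            vmx_load_vmcs(p);
    }
```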
-- 
Thanks,
Sergey
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel