
Re: [Xen-devel] [PATCH] x86/XPTI: fix S3 resume (and CPU offlining in general)



Andrew Cooper:
> On 24/05/18 15:35, Simon Gaiser wrote:
>> Andrew Cooper:
>>> On 24/05/18 15:14, Simon Gaiser wrote:
>>>> Jan Beulich:
>>>>>>>> On 24.05.18 at 16:00, <simon@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>>>>> Jan Beulich:
>>>>>>> In commit d1d6fc97d6 ("x86/xpti: really hide almost all of Xen image")
>>>>>>> I've failed to remember the fact that multiple CPUs share a stub
>>>>>>> mapping page. Therefore it is wrong to unconditionally zap the mapping
>>>>>>> when bringing down a CPU; it may only be unmapped when no other online
>>>>>>> CPU uses that same page.
>>>>>>>
>>>>>>> Reported-by: Simon Gaiser <simon@xxxxxxxxxxxxxxxxxxxxxx>
>>>>>>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>>>>>>>
>>>>>>> --- a/xen/arch/x86/smpboot.c
>>>>>>> +++ b/xen/arch/x86/smpboot.c
>>>>>>> @@ -876,7 +876,21 @@ static void cleanup_cpu_root_pgt(unsigne
>>>>>>>  
>>>>>>>      free_xen_pagetable(rpt);
>>>>>>>  
>>>>>>> -    /* Also zap the stub mapping for this CPU. */
>>>>>>> +    /*
>>>>>>> +     * Also zap the stub mapping for this CPU, if no other online one uses
>>>>>>> +     * the same page.
>>>>>>> +     */
>>>>>>> +    if ( stub_linear )
>>>>>>> +    {
>>>>>>> +        unsigned int other;
>>>>>>> +
>>>>>>> +        for_each_online_cpu(other)
>>>>>>> +            if ( !((per_cpu(stubs.addr, other) ^ stub_linear) >> PAGE_SHIFT) )
>>>>>>> +            {
>>>>>>> +                stub_linear = 0;
>>>>>>> +                break;
>>>>>>> +            }
>>>>>>> +    }
>>>>>>>      if ( stub_linear )
>>>>>>>      {
>>>>>>>          l3_pgentry_t *l3t = l4e_to_l3e(common_pgt);
>>>>>> Tried this on-top of staging (fc5805daef) and I still get the same
>>>>>> double fault.
>>>>> Hmm, it worked for me offlining (and later re-onlining) several pCPU-s.
>>>>> What size a system are you testing on? Mine has got only 12 CPUs, i.e.
>>>>> all stubs are in the same page (and I'd never unmap anything here at all).
>>>> 4 cores + HT, so 8 CPUs from Xen's PoV.
>>> Can you try with the "x86/traps: Dump the instruction stream even for
>>> double faults" patch I've just posted, and show the full #DF panic log
>>> please?  (It's conceivable that there are multiple different issues here.)
>> With Jan's and your patch:
>>
>> (XEN) mce_intel.c:782: MCA Capability: firstbank 0, extended MCE MSR 0, BCAST, CMCI
>> (XEN) CPU0 CMCI LVT vector (0xf2) already installed
>> (XEN) Finishing wakeup from ACPI S3 state.
>> (XEN) Enabling non-boot CPUs  ...
>> (XEN) emul-priv-op.c:1166:d0v1 Domain attempted WRMSR 0000001b from 0x00000000fee00c00 to 0x00000000fee00000
>> (XEN) emul-priv-op.c:1166:d0v1 Domain attempted WRMSR 0000001b from 0x00000000fee00c00 to 0x00000000fee00800
>> (XEN) emul-priv-op.c:1166:d0v2 Domain attempted WRMSR 0000001b from 0x00000000fee00c00 to 0x00000000fee00000
>> (XEN) emul-priv-op.c:1166:d0v2 Domain attempted WRMSR 0000001b from 0x00000000fee00c00 to 0x00000000fee00800
>> (XEN) emul-priv-op.c:1166:d0v3 Domain attempted WRMSR 0000001b from 0x00000000fee00c00 to 0x00000000fee00000
>> (XEN) emul-priv-op.c:1166:d0v3 Domain attempted WRMSR 0000001b from 0x00000000fee00c00 to 0x00000000fee00800
>> (XEN) emul-priv-op.c:1166:d0v4 Domain attempted WRMSR 0000001b from 0x00000000fee00c00 to 0x00000000fee00000
>> (XEN) emul-priv-op.c:1166:d0v4 Domain attempted WRMSR 0000001b from 0x00000000fee00c00 to 0x00000000fee00800
> 
> /sigh - Naughty Linux.  The PVOps really ought to know that they don't
> have an APIC to play with, not that this is related to the crash.
> 
>> (XEN) *** DOUBLE FAULT ***
>> (XEN) ----[ Xen-4.11-rc  x86_64  debug=y   Not tainted ]----
>> (XEN) CPU:    0
>> (XEN) RIP:    e008:[<ffff82d08037b964>] handle_exception+0x9c/0xff
>> (XEN) RFLAGS: 0000000000010006   CONTEXT: hypervisor
>> (XEN) rax: ffffc90040ce40d8   rbx: 0000000000000000   rcx: 0000000000000003
>> (XEN) rdx: 0000000000000000   rsi: 0000000000000000   rdi: 0000000000000000
>> (XEN) rbp: 000036ffbf31bf07   rsp: ffffc90040ce4000   r8:  0000000000000000
>> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
>> (XEN) r12: 0000000000000000   r13: 0000000000000000   r14: ffffc90040ce7fff
>> (XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 0000000000042660
>> (XEN) cr3: 000000022200a000   cr2: ffffc90040ce3ff8
>> (XEN) fsb: 00007fa9e7909740   gsb: ffff88021e740000   gss: 0000000000000000
>> (XEN) ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: e010   cs: e008
>> (XEN) Xen code around <ffff82d08037b964> (handle_exception+0x9c/0xff):
>> (XEN)  00 f3 90 0f ae e8 eb f9 <e8> 07 00 00 00 f3 90 0f ae e8 eb f9 83 e9 01 75
>> (XEN) Current stack base ffffc90040ce0000 differs from expected ffff8300cec88000
>> (XEN) Valid stack range: ffffc90040ce6000-ffffc90040ce8000, sp=ffffc90040ce4000, tss.rsp0=ffff8300cec8ffa0
>> (XEN) No stack overflow detected. Skipping stack trace.
> 
> Ok - this is the same as George's crash, and yes - I did misdiagnose the
> stack we were on.  I presume this hardware doesn't have SMAP? (or we'd
> have expected to take a #DF immediately at the head of the syscall handler.)

Yes, it's too old for SMAP. It's an i7-2760QM.

Attachment: signature.asc
Description: OpenPGP digital signature
