
Re: S3 resume issue in xstate_init


  • To: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Tue, 17 Aug 2021 12:14:36 +0100
  • Delivery-date: Tue, 17 Aug 2021 11:14:49 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 17/08/2021 12:02, Marek Marczykowski-Górecki wrote:
> On Tue, Aug 17, 2021 at 03:25:21AM +0200, Marek Marczykowski-Górecki wrote:
>> Hi,
>>
>> I've got another S3 issue:
>>
>> (XEN) Preparing system for ACPI S3 state.
>> (XEN) Disabling non-boot CPUs ...
>> (XEN) Broke affinity for IRQ1, new: ffff
>> (XEN) Broke affinity for IRQ16, new: ffff
>> (XEN) Broke affinity for IRQ9, new: ffff
>> (XEN) Broke affinity for IRQ139, new: ffff
>> (XEN) Broke affinity for IRQ8, new: ffff
>> (XEN) Broke affinity for IRQ14, new: ffff
>> (XEN) Broke affinity for IRQ20, new: ffff
>> (XEN) Broke affinity for IRQ137, new: ffff
>> (XEN) Broke affinity for IRQ138, new: ffff
>> (XEN) Entering ACPI S3 state.
>> (XEN) mce_intel.c:773: MCA Capability: firstbank 0, extended MCE MSR 0, BCAST, CMCI
>> (XEN) CPU0 CMCI LVT vector (0xf1) already installed
>> (XEN) Finishing wakeup from ACPI S3 state.
>> (XEN) microcode: CPU0 updated from revision 0xca to 0xea, date = 2021-01-05
>> (XEN) xstate: size: 0x440 (uncompressed 0x440) and states: 0x1f
>> (XEN) Enabling non-boot CPUs  ...
>> (XEN) xstate: size: 0x440 (uncompressed 0x240) and states: 0x1f
>> (XEN) Xen BUG at xstate.c:673
>> (XEN) ----[ Xen-4.16-unstable  x86_64  debug=y  Not tainted ]----
>> (XEN) CPU:    1
>> (XEN) RIP:    e008:[<ffff82d040350ee4>] xstate_init+0x24b/0x2ff
>> (XEN) RFLAGS: 0000000000010087   CONTEXT: hypervisor
>> (XEN) rax: 0000000000000240   rbx: 000000000000001f   rcx: 0000000000000440
>> (XEN) rdx: 0000000000000001   rsi: 000000000000000a   rdi: 000000000000001f
>> (XEN) rbp: ffff83025dc9fd38   rsp: ffff83025dc9fd20   r8:  0000000000000001
>> (XEN) r9:  ffff83025dc9fc88   r10: 0000000000000001   r11: 0000000000000001
>> (XEN) r12: ffff83025dc9fd80   r13: 000000000000001f   r14: 0000000000000001
>> (XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000003526e0
>> (XEN) cr3: 0000000049656000   cr2: 0000000000000000
>> (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>> (XEN) Xen code around <ffff82d040350ee4> (xstate_init+0x24b/0x2ff):
>> (XEN)  ff e9 a2 00 00 00 0f 0b <0f> 0b 89 f8 89 f1 0f a2 89 f2 4c 8b 0d cb b4 0f
>> (XEN) Xen stack trace from rsp=ffff83025dc9fd20:
>> (XEN)    0000000000000240 ffff83025dc9fd80 0000000000000001 ffff83025dc9fd70
>> (XEN)    ffff82d04027e7a1 000000004035a7f1 7ffafbbf01100800 00000000bfebfbff
>> (XEN)    0000000000000001 00000000000000c8 ffff83025dc9feb8 ffff82d0402e43ce
>> (XEN)    000000160a9e0106 bfebfbff80000008 2c1008007ffaf3bf 0000000f00000121
>> (XEN)    00000000029c6fbf 0000000000000100 000000009c002e00 02afcd7f00000000
>> (XEN)    756e654700000000 6c65746e49656e69 65746e4904b21920 726f43202952286c
>> (XEN)    376920294d542865 432048303537382d 322e322040205550 000000007a484730
>> (XEN)    ffff830000000000 ffff83025dc9fe18 00002400402e8e0b 000000085dc9fe30
>> (XEN)    00000002402e9f21 0000000000000001 ffffffff00000000 ffff82d0402e0040
>> (XEN)    00000000003526e0 ffff83025dc9fe68 ffff82d04027bd15 0000000000000001
>> (XEN)    ffff8302590a0000 0000000000000000 00000000000000c8 0000000000000001
>> (XEN)    0000000000000001 ffff83025dc9feb8 ffff82d0402e32b7 0000000000000001
>> (XEN)    0000000000000001 00000000000000c8 0000000000000001 ffff83025dc9fee8
>> (XEN)    ffff82d04030e401 0000000000000001 0000000000000000 0000000000000000
>> (XEN)    0000000000000000 0000000000000000 ffff82d040200122 0800002000000002
>> (XEN)    0100000400010000 0000002000000000 2000000000100000 0000001000000000
>> (XEN)    2000000000000000 0000000029000000 0000008000000000 00110000a0000000
>> (XEN)    8000000080000000 4000000000000008 0000100000000000 0200000040000080
>> (XEN)    0004000000000000 0000010000000002 0400002030000000 0000000060000000
>> (XEN)    0400001000010000 0000000010000000 0000004010000000 0000000000000000
>> (XEN) Xen call trace:
>> (XEN)    [<ffff82d040350ee4>] R xstate_init+0x24b/0x2ff
>> (XEN)    [<ffff82d04027e7a1>] F identify_cpu+0x318/0x4af
>> (XEN)    [<ffff82d0402e43ce>] F recheck_cpu_features+0x1f/0x72
>> (XEN)    [<ffff82d04030e401>] F start_secondary+0x255/0x38a
>> (XEN)    [<ffff82d040200122>] F __high_start+0x82/0x91
>> (XEN) 
>> (XEN) 
>> (XEN) ****************************************
>> (XEN) Panic on CPU 1:
>> (XEN) Xen BUG at xstate.c:673
>> (XEN) ****************************************
>> (XEN) 
>> (XEN) Reboot in five seconds...
>>
>> This is with added debug patch:
>>
>> diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
>> index 6aaf9a2f1546..7873a21b356a 100644
>> --- a/xen/arch/x86/xstate.c
>> +++ b/xen/arch/x86/xstate.c
>> @@ -668,6 +668,8 @@ void xstate_init(struct cpuinfo_x86 *c)
>>      else
>>      {
>>          BUG_ON(xfeature_mask != feature_mask);
>> +        printk("xstate: size: %#x (uncompressed %#x) and states: %#"PRIx64"\n",
>> +               xsave_cntxt_size, hw_uncompressed_size(feature_mask), feature_mask);
>>          BUG_ON(xsave_cntxt_size != hw_uncompressed_size(feature_mask));
>>      }
>>  
>>
>> As can be seen above - the xsave size differs between BSP and other
>> CPU(s) - likely because of (not) loaded ucode update there.
>> I guess it's a matter of moving ucode loading somewhere else, right?
>
> Few more data points:
>
> 1. The CPU is i7-8750H (family 6, model 158, stepping 10).
> 2. I do have "smt=off" on the Xen cmdline, if that matters.

As a datapoint, it would be interesting to confirm what the behaviour is
with SMT enabled.

I'd expect it not to make a difference, because smt=off is a purely Xen
construct and doesn't change the hardware configuration.

>
> I've tried the same without letting Xen load the ucode update (so,
> staying at 0xca) and got the same effect. So, I think it isn't about
> ucode...

Any chance of a full boot log?

This is bizarre.  Looking through start_secondary(), we've got an
ordering error between updating microcode and checking for dropped
features, but again I don't think this would be relevant here.

I suspect this is going to take some more custom debugging logic.
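
One observation that might help direct that debugging: 0x240 is 576
bytes, which is exactly the 512-byte legacy area plus the 64-byte XSAVE
header.  CPUID leaf 0xD, subleaf 0, EBX reports the size needed for the
features currently enabled in XCR0, so a value of 0x240 is what one
would expect to see if only x87/SSE were effectively enabled at the
time of the query.  The sketch below is only an illustration of that
arithmetic, not Xen code: the helper name and the layout table are
hypothetical, and the offsets and sizes are assumed to be the values
typically enumerated by CPUID.0xD[2..4] on Intel parts with AVX and MPX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One XSAVE state component in the standard (non-compacted) format. */
struct xsave_component {
    uint32_t offset; /* as enumerated in CPUID.0xD[i].EBX */
    uint32_t size;   /* as enumerated in CPUID.0xD[i].EAX */
};

/*
 * ASSUMPTION: typical Intel values, hard-coded for illustration only.
 * Real code must read them from CPUID on the CPU in question.
 * Components 0 and 1 (x87, SSE) live inside the legacy area.
 */
static const struct xsave_component assumed_layout[] = {
    [2] = {  576, 256 }, /* AVX (upper halves of YMM registers) */
    [3] = {  960,  64 }, /* MPX BNDREGS */
    [4] = { 1024,  64 }, /* MPX BNDCSR */
};

#define XSAVE_LEGACY_AREA 512u
#define XSAVE_HEADER       64u
#define NR_COMPONENTS (sizeof(assumed_layout) / sizeof(assumed_layout[0]))

/* Hypothetical helper: expected uncompressed size for a given XCR0 value. */
static uint32_t expected_uncompressed_size(uint64_t xcr0)
{
    uint32_t size = XSAVE_LEGACY_AREA + XSAVE_HEADER;
    size_t i;

    /* The table above only describes components 0..4. */
    assert((xcr0 >> NR_COMPONENTS) == 0);

    for ( i = 2; i < NR_COMPONENTS; i++ )
    {
        uint32_t end;

        if ( !(xcr0 & (1ull << i)) )
            continue;

        /* The area must extend to the end of the furthest enabled component. */
        end = assumed_layout[i].offset + assumed_layout[i].size;
        if ( end > size )
            size = end;
    }

    return size;
}
```

Under those assumptions, a mask of 0x1f gives 0x440 (matching the BSP
line in the log), while a mask of 0x3 gives 0x240 (matching the AP
line).  Printing XCR0 and the raw CPUID.0xD[0] output on the AP just
before the BUG_ON() could confirm or rule this out.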

~Andrew




 

