
Re: S3 resume issue in xstate_init


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 17 Aug 2021 15:21:32 +0200
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 17 Aug 2021 13:21:43 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 17.08.2021 15:06, Andrew Cooper wrote:
> On 17/08/2021 13:53, Andrew Cooper wrote:
>> On 17/08/2021 13:21, Jan Beulich wrote:
>>> I'm kind of guessing that set_xcr0() mistakenly skips the actual XCR0
>>> write, due to the cached value matching the to-be-written one, yet
>>> the cache having gone stale across S3.
>> This is a rats nest, and area for concern, but ...
>>
>>>  I think this is to be expected
>>> for previously parked CPUs, as those don't have their per-CPU data
>>> de-allocated (and hence also not re-setup, and thus also not starting
>>> out as zero).
>> ... we don't deallocate regular APs either, so I don't see why the smt=
>> setting would make a difference in this case.
>>
>> To be clear - I think your guess about set_xcr0() skipping the write is
>> correct, because 0x240 is correct for xcr0=X87, but I don't see why smt=
>> makes a difference.

Right - as per my later reply to Marek, I don't see an immediate reason
anymore either. I could see an indirect one, though: different scheduler
decisions affecting what ran last on the respective CPUs.

>>>  I guess an easy fix would be to write 0 to
>>> this_cpu(xcr0) directly early in xstate_init(), maybe in an "else"
>>> to the early "if ( bsp )".
>>>
>>> I'm not sure though this would be a good fix longer term, as there
>>> might easily be other similar issues elsewhere. IOW we may need to
>>> see whether per-CPU data initialization wouldn't want changing.
>> We've got other registers too, like MSR_TSC_AUX, but I don't think we
>> want to be doing anything as drastic as changing how the initialisation
>> works.
>>
>> The S3 path needs to explicitly write every register we do lazy context
>> switching of.
> 
> Actually no - that's a dumb suggestion because the APs don't know
> better, and we don't want for_each_cpu() loops running from the BSP.
> 
> Perhaps we want the cpu_down() logic to explicitly invalidate their
> lazily cached values?

I'd rather do this on the cpu_up() path (no point invalidating values
which may only get clobbered again, and then perhaps not to a value of
our liking). But then we can actually avoid doing this from a notifier,
and instead do it early enough in xstate_init(), taking care of XSS at
the same time.
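A rough sketch of what such an invalidation could look like (names and
placement are made up for illustration; the real change would sit early
on the non-BSP path of xstate_init()):

#include <stdint.h>

/* Per-CPU shadows as in the earlier sketch (again, illustrative names). */
static uint64_t xcr0_shadow;
static uint64_t xss_shadow;

/*
 * Poison the cached values before any xstate setup on the freshly
 * (re)started CPU, so the next lazy set_xcr0() / set_msr_xss() cannot
 * skip the hardware write.  0 can never match a valid XCR0 value (the
 * X87 bit is always set there), and for MSR_XSS it simply matches the
 * register's architectural reset value, so the shadows end up either
 * forcing the next write or already in sync with hardware.
 */
static void invalidate_xstate_shadows(void)
{
    xcr0_shadow = 0;
    xss_shadow  = 0;
}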

Jan
