[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86: extend coverage of HLE "bad page" workaround


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 20 Mar 2023 10:24:14 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=VqVTmnjHZXKZ7Fq15XdGFcWddqPM6ObnFZpubJkk2ew=; b=RoljEUTjaEmgJjOBx3x4Ce/WVZ+x3zkxxwRBCFc3e9IUnjTrTUpSmPa/XUwPpP1ntVW2sutLe/DIcRtVEsa9uXkJ6dtbT2xUqa353tWRdrsLompzO8O+p9GNEp+3FjXb0FzcsVSKJ1jnnDIEEHfmKUCp17YBRH9jb0PjAsP7ddcywjhPs8dRzkW2sc1j/ufw+1+zGyBjSZFF4Jmj317/sf6o3zOVqiwZ123XOlASx5+3yKxVI6E3tU5qXjqQEL2tVuxhP9NlEMv1ZDIY80mS1w84u8RvGCgMZL9ZLv9H4yDhoHVwAIItHxQLjKJd66K4IQ3QbZRcuvLKdaEiDx16Zw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Dj65fbCYpTn1Van6VgalzVbCEkszEqkZHLxZYdEKxHvmLhXCetdNmSx63hamQGAI7B2RQSez/kfBDEWKVRj4ics1PKr0p0Du27VYOkBnscutrsszwJW6hjs98nshSlzEXIjMQQ0IbXjzevWhjzcd60SAwrhVUbzScJDzoIjvVBBGum/V2Kj6UUaAqgc20TvjT9lhOikrgHMdCN7MWGeZ55W+yDJ5IJ9qZweQxumc7TgpcUz42Zjfldyn9EpEFxEBPGDNdG1XfgWyAOeptYtjgb61HzQlmF1gPZOTDQvaGzgpHDp8b4Ej4eLXOJksKYk0L4Bp5zVKefv+AM4vwVlk1A==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Mon, 20 Mar 2023 09:24:33 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 17.03.2023 12:39, Roger Pau Monné wrote:
> On Tue, May 26, 2020 at 06:40:16PM +0200, Jan Beulich wrote:
>> On 26.05.2020 17:01, Andrew Cooper wrote:
>>> On 26/05/2020 14:35, Jan Beulich wrote:
>>>> On 26.05.2020 13:17, Andrew Cooper wrote:
>>>>> On 26/05/2020 07:49, Jan Beulich wrote:
>>>>>> Respective Core Gen10 processor lines are affected, too.
>>>>>>
>>>>>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>>>>>>
>>>>>> --- a/xen/arch/x86/mm.c
>>>>>> +++ b/xen/arch/x86/mm.c
>>>>>> @@ -6045,6 +6045,8 @@ const struct platform_bad_page *__init g
>>>>>>      case 0x000506e0: /* errata SKL167 / SKW159 */
>>>>>>      case 0x000806e0: /* erratum KBL??? */
>>>>>>      case 0x000906e0: /* errata KBL??? / KBW114 / CFW103 */
>>>>>> +    case 0x000a0650: /* erratum Core Gen10 U/H/S 101 */
>>>>>> +    case 0x000a0660: /* erratum Core Gen10 U/H/S 101 */
>>>>> This is marred in complexity.
>>>>>
>>>>> The enumeration of MSR_TSX_CTRL (from the TAA fix, but architectural
>>>>> moving forwards on any TSX-enabled CPU) includes a confirmation that HLE
>>>>> no longer exists/works.  This applies to IceLake systems, but possibly
>>>>> not their initial release configuration (hence, via a later microcode
>>>>> update).
>>>>>
>>>>> HLE is also disabled in microcode on all older parts for errata reasons,
>>>>> so in practice it doesn't exist anywhere now.
>>>>>
>>>>> I think it is safe to drop this workaround, and this does seem a more
>>>>> simple option than encoding which microcode turned HLE off (which sadly
>>>>> isn't covered by the spec updates, as even when turned off, HLE is still
>>>>> functioning according to its spec of "may speed things up, may do
>>>>> nothing"), or the interactions with the CPUID hiding capabilities of
>>>>> MSR_TSX_CTRL.
>>>> I'm afraid I don't fully follow: For one, does what you say imply HLE is
>>>> no longer enumerated in CPUID?
>>>
>>> No - sadly not.  For reasons of "not repeating the Haswell/Broadwell
>>> microcode fiasco", the HLE bit will continue to exist and be set. 
>>> (Although on CascadeLake and later, you can turn it off with MSR_TSX_CTRL.)
>>>
>>> It was always a weird CPUID bit.  You were supposed to put
>>> XACQUIRE/XRELEASE prefixes on your legacy locking, and it would be a nop
>>> on old hardware and go faster on newer hardware.
>>>
>>> There is nothing runtime code needs to look at the HLE bit for, except
>>> perhaps for UI reporting purposes.
>>
>> Do you know of some public Intel doc I could reference for all of this,
>> which I would kind of need in the description of a patch ...
>>
>>>> But then this
>>>> erratum does not have the usual text effectively meaning that an ucode
>>>> update is or will be available to address the issue; instead it says
>>>> that BIOS or VMM can reserve the respective address range.
>>>
>>> This is not surprising at all.  Turning off HLE was an unrelated
>>> activity, and I bet the link went unnoticed.
>>>
>>>> This - assuming the alternative you describe is indeed viable - then is 
>>>> surely
>>>> a much more intrusive workaround than needed. Which I wouldn't assume
>>>> they would suggest in such a case.
>>>
>>> My suggestion was to drop the workaround, not to complicated it with a
>>> microcode revision matrix.
>>
>> ... doing this? I don't think I've seen any of this in writing so far,
>> except by you. (I don't understand how this reply of yours relates to
>> what I was saying about the spec update. I understand what you are
>> suggesting. I merely tried to express that I'd have expected Intel to
>> point out the much easier workaround, rather than just a pretty involved
>> one.) Otherwise, may I suggest you make such a patch, to make sure it
>> has an adequate description?
> 
> Seeing as there seems to be some data missing to justify the commit -
> was has Linux done with those erratas?

While they deal with the SNB erratum in a similar way, I'm afraid I'm
unaware of Linux having or having had a workaround for the errata here.
Which, granted, is a little surprising when we did actually even issue
an XSA for this.

In fact I find Andrew's request even more surprising with that fact (us
having issued XSA-282 for it) in mind, which originally I don't think I
had paid attention to (nor recalled).

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.