Re: [PATCH] x86: extend coverage of HLE "bad page" workaround


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Fri, 17 Mar 2023 12:39:26 +0100
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Fri, 17 Mar 2023 11:40:03 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, May 26, 2020 at 06:40:16PM +0200, Jan Beulich wrote:
> On 26.05.2020 17:01, Andrew Cooper wrote:
> > On 26/05/2020 14:35, Jan Beulich wrote:
> >> On 26.05.2020 13:17, Andrew Cooper wrote:
> >>> On 26/05/2020 07:49, Jan Beulich wrote:
> >>>> Respective Core Gen10 processor lines are affected, too.
> >>>>
> >>>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> >>>>
> >>>> --- a/xen/arch/x86/mm.c
> >>>> +++ b/xen/arch/x86/mm.c
> >>>> @@ -6045,6 +6045,8 @@ const struct platform_bad_page *__init g
> >>>>      case 0x000506e0: /* errata SKL167 / SKW159 */
> >>>>      case 0x000806e0: /* erratum KBL??? */
> >>>>      case 0x000906e0: /* errata KBL??? / KBW114 / CFW103 */
> >>>> +    case 0x000a0650: /* erratum Core Gen10 U/H/S 101 */
> >>>> +    case 0x000a0660: /* erratum Core Gen10 U/H/S 101 */
> >>> This is marred in complexity.
> >>>
> >>> The enumeration of MSR_TSX_CTRL (from the TAA fix, but architectural
> >>> moving forwards on any TSX-enabled CPU) includes a confirmation that HLE
> >>> no longer exists/works.  This applies to IceLake systems, but possibly
> >>> not their initial release configuration (hence, via a later microcode
> >>> update).
> >>>
> >>> HLE is also disabled in microcode on all older parts for errata reasons,
> >>> so in practice it doesn't exist anywhere now.
> >>>
> >>> I think it is safe to drop this workaround, and this does seem a more
> >>> simple option than encoding which microcode turned HLE off (which sadly
> >>> isn't covered by the spec updates, as even when turned off, HLE is still
> >>> functioning according to its spec of "may speed things up, may do
> >>> nothing"), or the interactions with the CPUID hiding capabilities of
> >>> MSR_TSX_CTRL.
> >> I'm afraid I don't fully follow: For one, does what you say imply HLE is
> >> no longer enumerated in CPUID?
> > 
> > No - sadly not.  For reasons of "not repeating the Haswell/Broadwell
> > microcode fiasco", the HLE bit will continue to exist and be set. 
> > (Although on CascadeLake and later, you can turn it off with MSR_TSX_CTRL.)
> > 
> > It was always a weird CPUID bit.  You were supposed to put
> > XACQUIRE/XRELEASE prefixes on your legacy locking, and it would be a nop
> > on old hardware and go faster on newer hardware.
> > 
> > There is nothing runtime code needs to look at the HLE bit for, except
> > perhaps for UI reporting purposes.
> 
> Do you know of some public Intel doc I could reference for all of this,
> which I would kind of need in the description of a patch ...
> 
> >> But then this
> >> erratum does not have the usual text effectively meaning that an ucode
> >> update is or will be available to address the issue; instead it says
> >> that BIOS or VMM can reserve the respective address range.
> > 
> > This is not surprising at all.  Turning off HLE was an unrelated
> > activity, and I bet the link went unnoticed.
> > 
> >> This - assuming the alternative you describe is indeed viable - then is surely
> >> a much more intrusive workaround than needed. Which I wouldn't assume
> >> they would suggest in such a case.
> > 
> > My suggestion was to drop the workaround, not to complicate it with a
> > microcode revision matrix.
> 
> ... doing this? I don't think I've seen any of this in writing so far,
> except by you. (I don't understand how this reply of yours relates to
> what I was saying about the spec update. I understand what you are
> suggesting. I merely tried to express that I'd have expected Intel to
> point out the much easier workaround, rather than just a pretty involved
> one.) Otherwise, may I suggest you make such a patch, to make sure it
> has an adequate description?

Seeing as some data seems to be missing to justify the commit -
what has Linux done about those errata?
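
For context, here is a minimal standalone sketch of what the check in the
quoted hunk amounts to. It is not the Xen code: the helper names are
hypothetical, the list holds only the five signatures visible in the
diff, and it assumes the switch compares the CPUID leaf 1 EAX signature
with just the stepping bits cleared (every case value in the hunk has a
zero low nibble). The real mask in mm.c may clear more bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: only the stepping field (bits 3:0) of CPUID.1:EAX is ignored. */
#define STEPPING_MASK UINT32_C(0xf)

/*
 * Hypothetical helper, not a Xen function: report whether a raw
 * CPUID.1:EAX value matches one of the signatures shown in the hunk.
 */
static bool hle_bad_page_affected(uint32_t cpuid1_eax)
{
    switch ( cpuid1_eax & ~STEPPING_MASK )
    {
    case 0x000506e0: /* existing entry (SKL167 / SKW159) */
    case 0x000806e0: /* existing entry (KBL) */
    case 0x000906e0: /* existing entry (KBL / KBW114 / CFW103) */
    case 0x000a0650: /* added by the patch (Core Gen10 U/H/S 101) */
    case 0x000a0660: /* added by the patch (Core Gen10 U/H/S 101) */
        return true;

    default:
        return false;
    }
}

/* Usage example: any stepping of a listed signature matches. */
static void hle_bad_page_self_check(void)
{
    assert(hle_bad_page_affected(0x000a0655));  /* 0xa0650, stepping 5 */
    assert(hle_bad_page_affected(0x000a0660));  /* 0xa0660, stepping 0 */
    assert(!hle_bad_page_affected(0x000306c3)); /* signature not in the list */
}
```

Dropping the workaround, as suggested above, would remove this kind of
per-signature list entirely rather than extend it.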

Thanks, Roger.