[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] x86: extend coverage of HLE "bad page" workaround
On 20/03/2023 9:24 am, Jan Beulich wrote: > On 17.03.2023 12:39, Roger Pau Monné wrote: >> On Tue, May 26, 2020 at 06:40:16PM +0200, Jan Beulich wrote: >>> On 26.05.2020 17:01, Andrew Cooper wrote: >>>> On 26/05/2020 14:35, Jan Beulich wrote: >>>>> On 26.05.2020 13:17, Andrew Cooper wrote: >>>>>> On 26/05/2020 07:49, Jan Beulich wrote: >>>>>>> Respective Core Gen10 processor lines are affected, too. >>>>>>> >>>>>>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx> >>>>>>> >>>>>>> --- a/xen/arch/x86/mm.c >>>>>>> +++ b/xen/arch/x86/mm.c >>>>>>> @@ -6045,6 +6045,8 @@ const struct platform_bad_page *__init g >>>>>>> case 0x000506e0: /* errata SKL167 / SKW159 */ >>>>>>> case 0x000806e0: /* erratum KBL??? */ >>>>>>> case 0x000906e0: /* errata KBL??? / KBW114 / CFW103 */ >>>>>>> + case 0x000a0650: /* erratum Core Gen10 U/H/S 101 */ >>>>>>> + case 0x000a0660: /* erratum Core Gen10 U/H/S 101 */ >>>>>> This is marred in complexity. >>>>>> >>>>>> The enumeration of MSR_TSX_CTRL (from the TAA fix, but architectural >>>>>> moving forwards on any TSX-enabled CPU) includes a confirmation that HLE >>>>>> no longer exists/works. This applies to IceLake systems, but possibly >>>>>> not their initial release configuration (hence, via a later microcode >>>>>> update). >>>>>> >>>>>> HLE is also disabled in microcode on all older parts for errata reasons, >>>>>> so in practice it doesn't exist anywhere now. >>>>>> >>>>>> I think it is safe to drop this workaround, and this does seem a more >>>>>> simple option than encoding which microcode turned HLE off (which sadly >>>>>> isn't covered by the spec updates, as even when turned off, HLE is still >>>>>> functioning according to its spec of "may speed things up, may do >>>>>> nothing"), or the interactions with the CPUID hiding capabilities of >>>>>> MSR_TSX_CTRL. >>>>> I'm afraid I don't fully follow: For one, does what you say imply HLE is >>>>> no longer enumerated in CPUID? >>>> No - sadly not. For reasons of "not repeating the Haswell/Broadwell >>>> microcode fiasco", the HLE bit will continue to exist and be set. >>>> (Although on CascadeLake and later, you can turn it off with MSR_TSX_CTRL.) >>>> >>>> It was always a weird CPUID bit. You were supposed to put >>>> XACQUIRE/XRELEASE prefixes on your legacy locking, and it would be a nop >>>> on old hardware and go faster on newer hardware. >>>> >>>> There is nothing runtime code needs to look at the HLE bit for, except >>>> perhaps for UI reporting purposes. >>> Do you know of some public Intel doc I could reference for all of this, >>> which I would kind of need in the description of a patch ... >>> >>>>> But then this >>>>> erratum does not have the usual text effectively meaning that an ucode >>>>> update is or will be available to address the issue; instead it says >>>>> that BIOS or VMM can reserve the respective address range. >>>> This is not surprising at all. Turning off HLE was an unrelated >>>> activity, and I bet the link went unnoticed. >>>> >>>>> This - assuming the alternative you describe is indeed viable - then is >>>>> surely >>>>> a much more intrusive workaround than needed. Which I wouldn't assume >>>>> they would suggest in such a case. >>>> My suggestion was to drop the workaround, not to complicated it with a >>>> microcode revision matrix. >>> ... doing this? I don't think I've seen any of this in writing so far, >>> except by you. (I don't understand how this reply of yours relates to >>> what I was saying about the spec update. I understand what you are >>> suggesting. I merely tried to express that I'd have expected Intel to >>> point out the much easier workaround, rather than just a pretty involved >>> one.) Otherwise, may I suggest you make such a patch, to make sure it >>> has an adequate description? >> Seeing as there seems to be some data missing to justify the commit - >> was has Linux done with those erratas? > While they deal with the SNB erratum in a similar way, I'm afraid I'm > unaware of Linux having or having had a workaround for the errata here. > Which, granted, is a little surprising when we did actually even issue > an XSA for this. > > In fact I find Andrew's request even more surprising with that fact (us > having issued XSA-282 for it) in mind, which originally I don't think I > had paid attention to (nor recalled). No - I'm aware of it. It probably was the right move at the time. But, Intel have subsequently killed HLE in microcode updates update in all CPUs it ever existed in (to fix a memory ordering erratum), and removed it from the architecture moving forwards (the enumeration of TSX_CTRL means HLE architecturally doesn't exist even if it is enumerated). These errata no longer exist when you've got up to date microcode, and if you've not got up to date microcode on these CPUs, you've got far worse security problems. ~Andrew
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |