[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/S3: restore MCE (APs) and add MTRR (BSP) init


  • To: Marek Marczykowski <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 23 Mar 2026 12:21:46 +0100
  • Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Mon, 23 Mar 2026 11:21:50 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 04.03.2026 16:00, Marek Marczykowski wrote:
> On Wed, Mar 04, 2026 at 03:47:14PM +0100, Jan Beulich wrote:
>> On 04.03.2026 15:36, Marek Marczykowski wrote:
>>> On Wed, Mar 04, 2026 at 02:39:01PM +0100, Jan Beulich wrote:
>>>> MCE init for APs was broken when CPU feature re-checking was added. MTRR
>>>> (re)init for the BSP looks to never have been there on the resume path.
>>>>
>>>> Fixes: bb502a8ca592 ("x86: check feature flags after resume")
>>>> Reported-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
>>>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>>>> ---
>>>> Sadly we need to go by CPU number (zero vs non-zero) here. See the call
>>>> site of recheck_cpu_features() in enter_state().
>>>
>>> With this patch, I now see the "Thermal monitoring enabled" on resume
>>> also for AP.
>>> And then, the "Temperature above threshold" + "Running in modulated
>>> clock mode" for AP too. But, I don't see matching "Temperature/speed
>>> normal" for any of them...
>>
>> Which would imply that for each CPU you see at most one such message after
>> resume. Can you confirm this? 
> 
> For the current test, yes. I got the messages for CPUs 16, 6, 18, 4, 2 -
> in this order. Not for 0, 8-15 or 20-21. Not sure about CPU0, but for
> others it kinda looks like I got it for P cores, but not E cores? But
> I'm not sure how to reliably distinguish them - I base it on the holes
> in numbering due to smt=off. Specifically I have online CPUs:
> 0,2,4,6,8-16,18,20-21 (yeah, weird ordering...).

I wonder, btw, if this is good enough to translate into a Tested-by: for
this patch. Thoughts?

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.