[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 4/5] x86/mwait-idle: disable IBRS during long idle


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 13 Oct 2022 14:03:12 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=slvnrnLNONI7rxqmZwiFBSS2wgx8kaPTR+IWnPIqHYY=; b=KbYuVofjyjqbvtMzSHqewakpUSeCO8Gpgl8E1/y6x6sDT0MlZevqLJ2XYrjUVHT/Wc37IDu0YEOXz9afApiuxOCpciahdsu/1CUBbMBskSfsP6hkuhRiZ+CHiKrrAo9emKzE2z0crBaJ4lVBW9rfAwDUUV4C7QJqRMu2HUQ+bR6gqyR9XYxeTuk1WOJ4x8HdNmKSauZLIV2U9tqdE/ii8LS0VNOa/PKlIlUuqptd+7ydIsta/7rgCQWsZcnRBCYWqgVRC1HpTYXVm3Z0XRv3Ja0tcnWAPAVkeVuC2XxMTOCmsSH0QorMucSiKqJQdHgEcMTmdw7I8uwx3jbkSJBnGA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=RBnWdcMYrKnOCKuepe6BlRH+DIBKGqQ2SL7A+8E2h8utoa8ca++qGIJ1qy6a7xmqMbJEm3BQQwcn8Onjf2EHbkfi9lpje+3vT6BhX+tiRGvOIEJyvMQ+ojfPuQ0XaWIUVGdb/XkQOwBJ4wBj/kUVav6QxAfB81sYzmNspwhQrilCn58uv+Bfna0hZsQLN47Gaan72KKncP4V1r7hKGjc5RsNrvQg6yiwMHfbqJoN7lOjKDpz5t3IVlecM79FcZVDgMLYt6no/2DH+qrGc3q4HGerjWIpoNWyPG7ZLC/+OkXAf+eLxftvTm4HkzQA9xfR2YTddFXoY47yjhtZNfuFRw==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Thu, 13 Oct 2022 12:03:35 +0000
  • Ironport-data: A9a23:7Utp4a4QEWH9OYbiYwBaQQxRtCDGchMFZxGqfqrLsTDasY5as4F+v mRLXTyAaKvYMWH8c9F0Poni8ktXsZbQnddiG1c9/3o3Hi5G8cbLO4+Ufxz6V8+wwm8vb2o8t plDNYOQRCwQZiWBzvt4GuG59RGQ7YnRGvymTras1hlZHWdMUD0mhQ9oh9k3i4tphcnRKw6Ws Jb5rta31GWNglaYCUpJrfPewP9TlK6q4mlB5gVlPakjUGL2zBH5MrpOfcldEFOgKmVkNrbSb /rOyri/4lTY838FYj9yuu+mGqGiaue60Tmm0hK6aYD76vRxjnVaPpIAHOgdcS9qZwChxLid/ jnvWauYEm/FNoWU8AgUvoIx/ytWZcWq85efSZSzXFD6I+QrvBIAzt03ZHzaM7H09c5VPH8X+ uMiGgtTRU+S2N7x2pCgcOlV05FLwMnDZOvzu1lG5BSBV7MKZMuGRK/Ho9hFwD03m8ZCW+7EY NYUYiZuaxKGZABTPlAQC9Q1m+LAanvXKmUE7g7K4/dppTSCpOBy+OGF3N79YNuFSN8Thk+Fj mnH4374ElcRM9n3JT+toijy3beRx3mTtIQ6SrTj/71Igl6p61czEjkdZ0Dim+mlsxvrMz5YA wlOksY0loAw/kG2Stj2XzWjvWWJ+BUbXrJ4A+A8rQ2A1KfQywKYHXQfCC5MbsQ8s807TiBs0 UWG9/v2ARR/vbvTTmiSnop4thu3MCkRaGMHPikNSFNf58G5+N1iyBXSUtxkDai5yMXvHi39y CyLqy54gKgPickM1OOw+lWvby+Qm6UlhzUdvm3/Nl9JJCsgDGJ5T+REMWTm0Ms=
  • Ironport-hdrordr: A9a23:kPSUkqu4Gt3VCaAgvT0kTE2E7skC7YMji2hC6mlwRA09TyXGra 2TdaUgvyMc1gx7ZJhBo7+90We7MBbhHLpOkPEs1NCZLXLbUQqTXfhfBO7ZrwEIdBefygcw79 YCT0E6MqyLMbEYt7eE3ODbKadG/DDvysnB64bjJjVWPGdXgslbnntE422gYylLrWd9dPgE/M 323Ls7m9PsQwVgUu2LQl0+G8TTrdzCk5zrJTYAGh4c8QGLyRel8qTzHRS01goXF2on+8ZrzU H11yjCoomzufCyzRHRk0fV8pRtgdPkjv9OHtaFhMQ5IijlziyoeINicbufuy1dmpDn1H8a1P 335zswNcV67H3cOkmzvBvWwgHllA0j7nfzoGXo9EfLkIjcfnYXGsBBjYVWfl/y8Ew7puxx16 pNwiawq4dXJQmoplWw2/H4EzVR0makq3srluAey1ZFV5EFVbNXpYsDuGtIDZY7Gj7g4oxPKp giMCjl3ocZTbqmVQGZgoE2q+bcHkjbXy32CHTqg/blnAS/xxtCvgglLM92pAZzyHtycegH2w 3+CNUZqFh/dL5pUUtDPpZxfSKWMB24ffueChPkHX3XUIc6Blnql7nbpJ0I2cDCQu168HJ1ou WLbG9l
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Aug 18, 2022 at 03:04:51PM +0200, Jan Beulich wrote:
> From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> 
> Having IBRS enabled while the SMT sibling is idle unnecessarily slows
> down the running sibling. OTOH, disabling IBRS around idle takes two
> MSR writes, which will increase the idle latency.
> 
> Therefore, only disable IBRS around deeper idle states. Shallow idle
> states are bounded by the tick in duration, since NOHZ is not allowed
> for them by virtue of their short target residency.
> 
> Only do this for mwait-driven idle, since that keeps interrupts disabled
> across idle, which makes disabling IBRS vs IRQ-entry a non-issue.
> 
> Note: C6 is a random threshold, most importantly C1 probably shouldn't
> disable IBRS, benchmarking needed.
> 
> Suggested-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Signed-off-by: Borislav Petkov <bp@xxxxxxx>
> Reviewed-by: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
> Signed-off-by: Borislav Petkov <bp@xxxxxxx>
> Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
> bf5835bcdb96
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>

Acked-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

One unrelated comment below.

> ---
> v3: New.
> 
> --- a/xen/arch/x86/cpu/mwait-idle.c
> +++ b/xen/arch/x86/cpu/mwait-idle.c
> @@ -141,6 +141,12 @@ static const struct cpuidle_state {
>  #define CPUIDLE_FLAG_TLB_FLUSHED     0x10000
>  
>  /*
> + * Disable IBRS across idle (when KERNEL_IBRS), is exclusive vs IRQ_ENABLE
> + * above.
> + */
> +#define CPUIDLE_FLAG_IBRS            0x20000
> +
> +/*
>   * MWAIT takes an 8-bit "hint" in EAX "suggesting"
>   * the C-state (top nibble) and sub-state (bottom nibble)
>   * 0x00 means "MWAIT(C1)", 0x10 means "MWAIT(C2)" etc.
> @@ -530,31 +536,31 @@ static struct cpuidle_state __read_mostl
>       },
>       {
>               .name = "C6",
> -             .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
> +             .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED | 
> CPUIDLE_FLAG_IBRS,
>               .exit_latency = 85,
>               .target_residency = 200,
>       },
>       {
>               .name = "C7s",
> -             .flags = MWAIT2flg(0x33) | CPUIDLE_FLAG_TLB_FLUSHED,
> +             .flags = MWAIT2flg(0x33) | CPUIDLE_FLAG_TLB_FLUSHED | 
> CPUIDLE_FLAG_IBRS,
>               .exit_latency = 124,
>               .target_residency = 800,
>       },
>       {
>               .name = "C8",
> -             .flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED,
> +             .flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED | 
> CPUIDLE_FLAG_IBRS,
>               .exit_latency = 200,
>               .target_residency = 800,
>       },
>       {
>               .name = "C9",
> -             .flags = MWAIT2flg(0x50) | CPUIDLE_FLAG_TLB_FLUSHED,
> +             .flags = MWAIT2flg(0x50) | CPUIDLE_FLAG_TLB_FLUSHED | 
> CPUIDLE_FLAG_IBRS,
>               .exit_latency = 480,
>               .target_residency = 5000,
>       },
>       {
>               .name = "C10",
> -             .flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
> +             .flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED | 
> CPUIDLE_FLAG_IBRS,
>               .exit_latency = 890,
>               .target_residency = 5000,
>       },
> @@ -576,7 +582,7 @@ static struct cpuidle_state __read_mostl
>       },
>       {
>               .name = "C6",
> -             .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
> +             .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED | 
> CPUIDLE_FLAG_IBRS,
>               .exit_latency = 133,
>               .target_residency = 600,
>       },
> @@ -906,6 +912,7 @@ static const struct cpuidle_state snr_cs
>  static void cf_check mwait_idle(void)
>  {
>       unsigned int cpu = smp_processor_id();
> +     struct cpu_info *info = get_cpu_info();
>       struct acpi_processor_power *power = processor_powers[cpu];
>       struct acpi_processor_cx *cx = NULL;
>       unsigned int next_state;
> @@ -932,8 +939,6 @@ static void cf_check mwait_idle(void)
>                       pm_idle_save();
>               else
>               {
> -                     struct cpu_info *info = get_cpu_info();
> -
>                       spec_ctrl_enter_idle(info);
>                       safe_halt();
>                       spec_ctrl_exit_idle(info);

Do we need to disable speculation just for the hlt if there's no
C state change?

It would seem to me like the MSR writes could add a lot of latency
when there's no C state change.

Maybe I'm confused.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.