
Re: [PATCH v3 1/2] x86/mwait-idle: enable interrupts before C1 on Xeons


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Tue, 1 Feb 2022 11:46:10 +0100
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Tue, 01 Feb 2022 10:46:51 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Jan 27, 2022 at 04:13:21PM +0100, Jan Beulich wrote:
> From: Artem Bityutskiy <artem.bityutskiy@xxxxxxxxxxxxxxx>
> 
> Enable local interrupts before requesting C1 on the last two generations
> of Intel Xeon platforms: Sky Lake, Cascade Lake, Cooper Lake, Ice Lake.
> This decreases average C1 interrupt latency by about 5-10%, as measured
> with the 'wult' tool.
> 
> The '->enter()' function of the driver enters C-states with local
> interrupts disabled by executing the 'monitor' and 'mwait' pair of
> instructions. If an interrupt happens, the CPU exits the C-state and
> continues executing instructions after 'mwait'. It does not jump to
> the interrupt handler, because local interrupts are disabled. The
> cpuidle subsystem enables interrupts a bit later, after doing some
> housekeeping.
> 
> With this patch, we enable local interrupts before requesting C1. In
> this case, if the CPU wakes up because of an interrupt, it will jump
> to the interrupt handler right away. The cpuidle housekeeping will be
> done after the pending interrupt(s) are handled.
> 
> Enabling interrupts before entering a C-state has measurable impact
> for faster C-states, like C1. Deeper, but slower C-states like C6 do
> not really benefit from this sort of change, because their latency is
> a lot higher compared to the delay added by cpuidle housekeeping.
> 
> This change was also tested with cyclictest and dbench. In case of Ice
> Lake, the average cyclictest latency decreased by 5.1%, and the average
> 'dbench' throughput increased by about 0.8%. Both tests were run for 4
> hours with only C1 enabled (all other idle states, including 'POLL',
> were disabled). CPU frequency was pinned to HFM, and uncore frequency
> was pinned to the maximum value. The other platforms had similar
> single-digit percentage improvements.
> 
> It is worth noting that this patch affects 'cpuidle' statistics a tiny
> bit.  Before this patch, C1 residency did not include the interrupt
> handling time, but with this patch, it will include it. This is similar
> to what happens in case of the 'POLL' state, which also runs with
> interrupts enabled.
> 
> Suggested-by: Len Brown <len.brown@xxxxxxxxx>
> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@xxxxxxxxxxxxxxx>
> [Linux commit: c227233ad64c77e57db738ab0e46439db71822a3]
> 
> We don't have a pointer into cpuidle_state_table[] readily available.
> To compensate, propagate the flag into struct acpi_processor_cx.
> 
> Unlike Linux we want to
> - disable IRQs again after MWAITing, as subsequently invoked functions
>   assume so,
> - avoid enabling IRQs if cstate_restore_tsc() is not a no-op, to avoid
>   interfering with, in particular, the time rendezvous.
> 
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>

Acked-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

> ---
> RFC: I'm not entirely certain that we want to take this, i.e. whether
>      we're as much worried about interrupt latency.

I would assume taking this would make it easier for you to pick
further patches.

> RFC: I was going back and forth between putting the local_irq_enable()
>      ahead of or after cpu_is_haltable().
> ---
> v3: Propagate flag to struct acpi_processor_cx. Don't set flag when TSC
>     may stop while in a C-state.
> v2: New.
> 
> --- a/xen/arch/x86/cpu/mwait-idle.c
> +++ b/xen/arch/x86/cpu/mwait-idle.c
> @@ -108,6 +108,11 @@ static const struct cpuidle_state {
>  
>  #define CPUIDLE_FLAG_DISABLED                0x1
>  /*
> + * Enable interrupts before entering the C-state. On some platforms and for
> + * some C-states, this may measurably decrease interrupt latency.
> + */
> +#define CPUIDLE_FLAG_IRQ_ENABLE              0x8000
> +/*
>   * Set this flag for states where the HW flushes the TLB for us
>   * and so we don't need cross-calls to keep it consistent.
>   * If this flag is set, SW flushes the TLB, so even if the
> @@ -539,7 +544,7 @@ static struct cpuidle_state __read_mostl
>  static struct cpuidle_state __read_mostly skx_cstates[] = {
>       {
>               .name = "C1",
> -             .flags = MWAIT2flg(0x00),
> +             .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_IRQ_ENABLE,
>               .exit_latency = 2,
>               .target_residency = 2,
>       },
> @@ -561,7 +566,7 @@ static struct cpuidle_state __read_mostl
>  static const struct cpuidle_state icx_cstates[] = {
>         {
>                 .name = "C1",
> -               .flags = MWAIT2flg(0x00),
> +               .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_IRQ_ENABLE,
>                 .exit_latency = 1,
>                 .target_residency = 1,
>         },
> @@ -842,9 +847,15 @@ static void mwait_idle(void)
>  
>       update_last_cx_stat(power, cx, before);
>  
> -     if (cpu_is_haltable(cpu))
> +     if (cpu_is_haltable(cpu)) {
> +             if (cx->irq_enable_early)
> +                     local_irq_enable();
> +
>               mwait_idle_with_hints(cx->address, MWAIT_ECX_INTERRUPT_BREAK);
>  
> +             local_irq_disable();
> +     }
> +
>       after = alternative_call(cpuidle_get_tick);
>  
>       cstate_restore_tsc();
> @@ -1335,6 +1346,11 @@ static int mwait_idle_cpu_init(struct no
>               cx->latency = cpuidle_state_table[cstate].exit_latency;
>               cx->target_residency =
>                       cpuidle_state_table[cstate].target_residency;
> +             if ((cpuidle_state_table[cstate].flags &
> +                  CPUIDLE_FLAG_IRQ_ENABLE) &&
> +                 /* cstate_restore_tsc() needs to be a no-op */
> +                 boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
> +                     cx->irq_enable_early = true;
>  
>               dev->count++;
>       }
> --- a/xen/include/xen/cpuidle.h
> +++ b/xen/include/xen/cpuidle.h
> @@ -42,6 +42,7 @@ struct acpi_processor_cx
>      u8 idx;
>      u8 type;         /* ACPI_STATE_Cn */
>      u8 entry_method; /* ACPI_CSTATE_EM_xxx */
> +    bool irq_enable_early;

Should you use a bit field here and limit the field to :1 in
expectation of maybe adding more flags at a later point?
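
To illustrate what I mean, here is a rough, self-contained sketch. The
struct and helper below (cx_sketch, cx_sketch_init_flags,
SKETCH_FLAG_IRQ_ENABLE) are made-up stand-ins for illustration, not the
actual Xen definitions; only the fields visible in the hunk above are
mirrored, and the nonstop_tsc parameter stands in for the
boot_cpu_has(X86_FEATURE_NONSTOP_TSC) check:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for CPUIDLE_FLAG_IRQ_ENABLE from the patch. */
#define SKETCH_FLAG_IRQ_ENABLE 0x8000u

/*
 * Hypothetical stand-in for struct acpi_processor_cx. Only the fields
 * shown in the quoted hunk are reproduced here.
 */
struct cx_sketch {
    uint8_t idx;
    uint8_t type;
    uint8_t entry_method;
    /* One bit is enough; further single-bit flags could follow. */
    bool irq_enable_early:1;
};

/*
 * Mirror of the propagation logic in the patch: the flag is only set
 * when the table entry requests it and the TSC keeps running in
 * C-states (passed in as a plain bool here).
 */
static inline void cx_sketch_init_flags(struct cx_sketch *cx,
                                        unsigned int table_flags,
                                        bool nonstop_tsc)
{
    cx->irq_enable_early =
        (table_flags & SKETCH_FLAG_IRQ_ENABLE) && nonstop_tsc;
}
```

Whether this actually saves any space depends on padding and on how
many flags end up being added, so I don't have a strong opinion.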

Thanks, Roger.



 

