[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH] x86/x2apic: introduce a mixed physical/cluster mode


  • To: xen-devel@xxxxxxxxxxxxxxxxxxxx
  • From: Roger Pau Monne <roger.pau@xxxxxxxxxx>
  • Date: Tue, 24 Oct 2023 15:51:50 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=LNkMhdC+rCaRlp+Tt4F04Pgskhxq/sWocxlpcspEf1I=; b=WqEDc8VyjqKZwpbgAw0CvZbMGCMh4JV0EA7xZWvmjsBP7ebM8jHCDho4QlOBYdV9ICuGnjkIuj61RDokTdfCE3duyrKq55L43L82GbHsh7gviroXllP8/L6GFuVcB0x8JDhZ8sZWvZ+ihXBiAJYD/Z6XT0zDMAwPA7K+E2dMIFp4nVe5fjGpNGEWK4JgJh6qqceo119mSlspRh5B67oiLp1saKKmIReZJQrOuvD2uyki4cgYI1VShsyLb+RuKgAQyyWa9ECru4/rTpSf+FLuzEwjgikfMr8c4pLS7gxMrwUS9VemBm0L9JwQI0+B4YDZRaXJaBwJcH7q1FSIr6C7KA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=FNxMNOir04CEOVKCM1Mo7jo0y2N+qJuPZIiDuZf45Y03fp/UJVonRix+Oi/WBsP2aeAwB3ioxfJ3FRSuCvi9T1fFdlCGQhTx9cQF7zljYbEZw/WPKtd8W1h+wRsSEgDujw+h8RzpXIb6jIYlczMdpxTWWIT9PJkZb9t47wOmHotltXbbRCESLH1P+XxpWpBwzbsMSbWmbBqlQKE47VwWCF1582ANXbTSvfDQKjGp5R1WFLR2JSwWh+2bzcQHgwCiMEiUFlXBiaB5KpJi6j9jruqFOLbgIdoPzCFg8U5Gz6Iipbw9v1250Jkm4sWzKSpe3a9FBUuWajSWg1qN7+Zadg==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Tue, 24 Oct 2023 13:55:31 +0000
  • Ironport-data: A9a23:i+m58K/qLnSKPoVYPHcDDrUDTn+TJUtcMsCJ2f8bNWPcYEJGY0x3n zcZCjzVOfmMamv2KtF+btjg9UhSvJfTndQxSwZuqyw8E34SpcT7XtnIdU2Y0wF+jCHgZBk+s 5hBMImowOQcFCK0SsKFa+C5xZVE/fjVAOK6UKidYnwZqTZMEE8JkQhkl/MynrlmiN24BxLlk d7pqojUNUTNNwRcawr40Ird7ks01BjOkGlA5AdnPaoX5Aa2e0Q9V/rzG4ngdxMUfaEMdgKKb 76r5K20+Grf4yAsBruN+losWhRXKlJ6FVHmZkt+A8BOsDAbzsAB+v9T2M4nQVVWk120c+VZk 72hg3ASpTABZcUgkMxFO/VR/roX0aduoNcrKlDn2SCfItGvn9IBDJyCAWlvVbD09NqbDkkX/ NJBJAgXKSndhvPrxJ6heOhD1p48eZyD0IM34hmMzBn/JNN/GNXoZPyP4tVVmjAtmspJAPDSI dIDbiZiZwjBZBsJPUoLDJU5n6GjgXyXnz9w8QrJ4/ZopTWNilUvgdABM/KMEjCObd9SkUuC4 HrP4kzyAw0ANczZwj2Amp6prraVw3KhB91PTNVU8NZTh0XL231UEyYkfkWVr8voqUOaB95Qf hl8Fi0G6PJaGFaQZsnwWVi0rWCJujYYWsFMCKsq5QeV0K3W7g2FQG8eQVZpatYrqcs3TjwCz UKSkpXiAjkHmL+ITXOQ8J+EoDX0PjIaRUcZfjMNRwYB59jloakwgwjJQ9IlF7S65vXqHRngz jbMqzIx74j/luYO3qS/uFvA2jSlo8GQShZvv1uGGGW48gl+eYipIZSy7kTW5upBK4DfSUSdu H8DmI6V6+Vm4YyxqRFhid4lRNmBj8tp+hWG6bKzN/HNLwiQxkM=
  • Ironport-hdrordr: A9a23:TwCYD67r6fd9bl0DUQPXwPLXdLJyesId70hD6qkRc203TiX8ra uTdZsguSMc5Ax/ZJhCo7C90cu7L080nKQdieN9AV7IZmjbUQWTXeZfxLqn7zr8GzDvss5xvJ 0QFZSW0eeAaGSSW/yKhDVQuOxQouW6zA==
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

The current implementation of x2APIC requires to either use Cluster Logical or
Physical mode for all interrupts.  However the selection of Physical vs Logical
is not done at APIC setup, an APIC can be addressed both in Physical or Logical
destination modes concurrently.

Introduce a new x2APIC mode called mixed, which uses Logical Cluster mode for
IPIs, and Physical mode for external interrupts, thus attempting to use the
best method for each interrupt type.

Using Physical mode for external interrupts allows more vectors to be used, and
interrupt balancing to be more accurate.

Using Logical Cluster mode for IPIs allows less accesses to the ICR register
when sending those, as multiple CPUs can be targeted with a single ICR register
write.

A simple test calling flush_tlb_all() 10000 times in a tight loop on a 96 CPU
box gives the following average figures:

Physical mode: 26617931ns
Mixed mode:    23865337ns

So ~10% improvement versus plain Physical mode.  Note that Xen uses Cluster
mode by default, and hence is already using the fastest way for IPI delivery at
the cost of reducing the amount of vectors available system-wide.

Make the newly introduced mode the default one.

Suggested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
---
Do we want to keep a full Logical Cluster mode available?  I don't see a reason
to target external interrupts in Logical Cluster mode, but maybe there's
something I'm missing.

It's not clear to me whether the ACPI FADT flags are meant to apply only to
external interrupt delivery mode, or also to IPI delivery.  If
ACPI_FADT_APIC_PHYSICAL is only meant to apply to external interrupts and not
IPIs then we could possibly get rid of physical mode IPI delivery.

Still need to put this under XenServer extensive testing, but wanted to get
some feedback before in case approach greatly changes.

Based on top of 'x86/x2apic: remove usage of ACPI_FADT_APIC_CLUSTER'
---
 docs/misc/xen-command-line.pandoc | 11 ++++
 xen/arch/x86/Kconfig              | 37 ++++++++++---
 xen/arch/x86/genapic/x2apic.c     | 87 ++++++++++++++++++++++---------
 3 files changed, 104 insertions(+), 31 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc 
b/docs/misc/xen-command-line.pandoc
index 6b07d0f3a17f..09c6f17b4b02 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -2802,6 +2802,14 @@ the watchdog.
 
 Permit use of x2apic setup for SMP environments.
 
+### x2apic-mode (x86)
+> `= physical | cluster | mixed`
+
+> Default: `physical` if **FADT** mandates physical mode, `mixed` otherwise.
+
+In the case that x2apic is in use, this option switches between modes to
+address APICs in the system as interrupt destinations.
+
 ### x2apic_phys (x86)
 > `= <boolean>`
 
@@ -2812,6 +2820,9 @@ In the case that x2apic is in use, this option switches 
between physical and
 clustered mode.  The default, given no hint from the **FADT**, is cluster
 mode.
 
+**WARNING: `x2apic_phys` is deprecated and superseded by `x2apic-mode`.
+The later takes precedence if both are set.**
+
 ### xenheap_megabytes (arm32)
 > `= <size>`
 
diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index eac77573bd75..a44d63b5c8dc 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -228,11 +228,18 @@ config XEN_ALIGN_2M
 
 endchoice
 
-config X2APIC_PHYSICAL
-       bool "x2APIC Physical Destination mode"
-       help
-         Use x2APIC Physical Destination mode by default when available.
+choice
+       prompt "x2APIC Destination mode"
+       default X2APIC_MIXED
+       ---help---
+         Select APIC addressing when x2APIC is enabled.
 
+         The default mode is mixed which should provide the best aspects
+         of both physical and cluster modes.
+
+config X2APIC_PHYSICAL
+       tristate "Physical Destination mode"
+       ---help---
          When using this mode APICs are addressed using the Physical
          Destination mode, which allows using all dynamic vectors on each
          CPU independently.
@@ -242,9 +249,27 @@ config X2APIC_PHYSICAL
          destination inter processor interrupts (IPIs) slightly slower than
          Logical Destination mode.
 
-         The mode when this option is not selected is Logical Destination.
+config X2APIC_CLUSTER
+       tristate "Cluster Destination mode"
+       ---help---
+         When using this mode APICs are addressed using the Cluster Logical
+         Destination mode.
 
-         If unsure, say N.
+         Cluster Destination has the benefit of sending IPIs faster since
+         multiple APICs can be targeted as destinations of a single IPI.
+         However the vector space is shared between all CPUs on the cluster,
+         and hence using this mode reduces the number of available vectors
+         when compared to Physical mode.
+
+config X2APIC_MIXED
+       tristate "Mixed Destination mode"
+       ---help---
+         When using this mode APICs are addressed using the Cluster Logical
+         Destination mode for IPIs and Physical mode for external interrupts.
+
+         Should provide the best of both modes.
+
+endchoice
 
 config GUEST
        bool
diff --git a/xen/arch/x86/genapic/x2apic.c b/xen/arch/x86/genapic/x2apic.c
index 707deef98c27..15632cc7332e 100644
--- a/xen/arch/x86/genapic/x2apic.c
+++ b/xen/arch/x86/genapic/x2apic.c
@@ -180,6 +180,25 @@ static const struct genapic __initconstrel 
apic_x2apic_cluster = {
     .send_IPI_self = send_IPI_self_x2apic
 };
 
+/*
+ * Mixed x2APIC mode: use physical for external (device) interrupts, and
+ * cluster for inter processor interrupt.  Such mode has the benefits of not
+ * sharing the vector space with all CPUs on the cluster, while still allows
+ * IPIs to be more efficiently delivered by not having to perform an ICR
+ * write for each target CPU.
+ */
+static const struct genapic __initconstrel apic_x2apic_mixed = {
+    APIC_INIT("x2apic_mixed", NULL),
+    /* NB: int_{delivery,dest}_mode are only used by non-IPI callers. */
+    .int_delivery_mode = dest_Fixed,
+    .int_dest_mode = 0 /* physical delivery */,
+    .init_apic_ldr = init_apic_ldr_x2apic_cluster,
+    .vector_allocation_cpumask = vector_allocation_cpumask_phys,
+    .cpu_mask_to_apicid = cpu_mask_to_apicid_phys,
+    .send_IPI_mask = send_IPI_mask_x2apic_cluster,
+    .send_IPI_self = send_IPI_self_x2apic
+};
+
 static int cf_check update_clusterinfo(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
@@ -220,38 +239,56 @@ static struct notifier_block x2apic_cpu_nfb = {
 static int8_t __initdata x2apic_phys = -1;
 boolean_param("x2apic_phys", x2apic_phys);
 
+enum {
+   unset, physical, cluster, mixed
+} static __initdata x2apic_mode = unset;
+
+static int __init parse_x2apic_mode(const char *s)
+{
+    if ( !cmdline_strcmp(s, "physical") )
+        x2apic_mode = physical;
+    else if ( !cmdline_strcmp(s, "cluster") )
+        x2apic_mode = cluster;
+    else if ( !cmdline_strcmp(s, "mixed") )
+        x2apic_mode = mixed;
+    else
+        return EINVAL;
+
+    return 0;
+}
+custom_param("x2apic-mode", parse_x2apic_mode);
+
 const struct genapic *__init apic_x2apic_probe(void)
 {
-    if ( x2apic_phys < 0 )
+    /* x2apic-mode option has preference over x2apic_phys. */
+    if ( x2apic_phys >= 0 && x2apic_mode == unset )
+        x2apic_mode = x2apic_phys ? physical : cluster;
+
+    if ( x2apic_mode == unset )
     {
-        /*
-         * Force physical mode if there's no (full) interrupt remapping 
support:
-         * The ID in clustered mode requires a 32 bit destination field due to
-         * the usage of the high 16 bits to hold the cluster ID.
-         */
-        x2apic_phys = iommu_intremap != iommu_intremap_full ||
-                      (acpi_gbl_FADT.flags & ACPI_FADT_APIC_PHYSICAL) ||
-                      IS_ENABLED(CONFIG_X2APIC_PHYSICAL);
-    }
-    else if ( !x2apic_phys )
-        switch ( iommu_intremap )
+        if ( acpi_gbl_FADT.flags & ACPI_FADT_APIC_PHYSICAL )
         {
-        case iommu_intremap_off:
-        case iommu_intremap_restricted:
-            printk("WARNING: x2APIC cluster mode is not supported %s interrupt 
remapping -"
-                   " forcing phys mode\n",
-                   iommu_intremap == iommu_intremap_off ? "without"
-                                                        : "with restricted");
-            x2apic_phys = true;
-            break;
-
-        case iommu_intremap_full:
-            break;
+            printk(XENLOG_INFO "ACPI FADT forcing x2APIC physical mode\n");
+            x2apic_mode = physical;
         }
+        else
+            x2apic_mode = IS_ENABLED(CONFIG_X2APIC_MIXED) ? mixed
+                          : (IS_ENABLED(CONFIG_X2APIC_PHYSICAL) ? physical
+                                                                : cluster);
+    }
 
-    if ( x2apic_phys )
+    if ( x2apic_mode == physical )
         return &apic_x2apic_phys;
 
+    if ( x2apic_mode == cluster && iommu_intremap != iommu_intremap_full )
+    {
+        printk("WARNING: x2APIC cluster mode is not supported %s interrupt 
remapping -"
+               " forcing mixed mode\n",
+               iommu_intremap == iommu_intremap_off ? "without"
+                                                    : "with restricted");
+        x2apic_mode = mixed;
+    }
+
     if ( !this_cpu(cluster_cpus) )
     {
         update_clusterinfo(NULL, CPU_UP_PREPARE,
@@ -260,7 +297,7 @@ const struct genapic *__init apic_x2apic_probe(void)
         register_cpu_notifier(&x2apic_cpu_nfb);
     }
 
-    return &apic_x2apic_cluster;
+    return x2apic_mode == cluster ? &apic_x2apic_cluster : &apic_x2apic_mixed;
 }
 
 void __init check_x2apic_preenabled(void)
-- 
2.42.0




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.