
Re: [PATCH v7 2/6] arm/irq: Migrate IRQs during CPU up/down operations





On 3/30/26 14:59, Mykyta Poturai wrote:

Hello Mykyta


Move IRQs from dying CPU to the online ones when a CPU is getting
offlined. When onlining, rebalance all IRQs in a round-robin fashion.
Guest-bound IRQs are already handled by the scheduler in the process of
moving vCPUs to active pCPUs, so we only need to handle the IRQs used by
Xen itself.

Signed-off-by: Mykyta Poturai <mykyta_poturai@xxxxxxxx>
---
v6->v7:
* replace ifdef with IS_ENABLED

v5->v6:
* don't do any balancing on boot
* only do balancing when cpu hotplug is enabled

v4->v5:
* handle CPU onlining as well
* more comments
* fix crash when ESPI is disabled
* don't assume CPU 0 is a boot CPU
* use unsigned int for irq number
* remove assumption that all IRQs are bound to CPU 0 by default from the
   commit message

v3->v4:
* patch introduced
---
  xen/arch/arm/include/asm/irq.h |  6 ++++
  xen/arch/arm/irq.c             | 59 ++++++++++++++++++++++++++++++++++
  xen/arch/arm/smpboot.c         |  7 ++++
  3 files changed, 72 insertions(+)

diff --git a/xen/arch/arm/include/asm/irq.h b/xen/arch/arm/include/asm/irq.h
index 09788dbfeb..3ed55e02c3 100644
--- a/xen/arch/arm/include/asm/irq.h
+++ b/xen/arch/arm/include/asm/irq.h
@@ -126,6 +126,12 @@ bool irq_type_set_by_domain(const struct domain *d);
  void irq_end_none(struct irq_desc *irq);
  #define irq_end_none irq_end_none
+#ifdef CONFIG_CPU_HOTPLUG
+void rebalance_irqs(unsigned int from, bool up);
+#else
+static inline void rebalance_irqs(unsigned int from, bool up) {}
+#endif
+
  #endif /* _ASM_HW_IRQ_H */
  /*
   * Local variables:
diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index 7204bc2b68..447bee428e 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -158,6 +158,60 @@ static int init_local_irq_data(unsigned int cpu)
      return 0;
  }
+#ifdef CONFIG_CPU_HOTPLUG
+static int cpu_next;
+
+static void balance_irq(int irq, unsigned int from, bool up)
+{
+    struct irq_desc *desc = irq_to_desc(irq);
+    unsigned long flags;
+
+    ASSERT(!cpumask_empty(&cpu_online_map));
+
+    spin_lock_irqsave(&desc->lock, flags);
+    if ( likely(!desc->action) )
+        goto out;
+
+    if ( likely(test_bit(_IRQ_GUEST, &desc->status) ||
+                test_bit(_IRQ_MOVE_PENDING, &desc->status)) )
+        goto out;
+
+    /*
+     * Setting affinity to a mask of multiple CPUs causes the GIC drivers to
+     * select one CPU from that mask. If the dying CPU was included in the
+     * IRQ's affinity mask, we cannot determine exactly which CPU the
+     * interrupt is currently routed to, as GIC drivers lack a concrete
+     * get_affinity API. So to be safe we must reroute it to a new,
+     * definitely online, CPU. When a CPU is going down, we move only the
+     * interrupts that could reside on it. Otherwise, we rearrange all
+     * interrupts in a round-robin fashion.
+     */
+    if ( !up && !cpumask_test_cpu(from, desc->affinity) )
+        goto out;
+
+    cpu_next = cpumask_cycle(cpu_next, &cpu_online_map);
+    irq_set_affinity(desc, cpumask_of(cpu_next));
+
+out:
+    spin_unlock_irqrestore(&desc->lock, flags);
+}
+
+void rebalance_irqs(unsigned int from, bool up)
+{
+    int irq;
+
+    if ( cpumask_empty(&cpu_online_map) )
+        return;
+
+    for ( irq = NR_LOCAL_IRQS; irq < NR_IRQS; irq++ )
+        balance_irq(irq, from, up);
+
+#ifdef CONFIG_GICV3_ESPI
+    for ( irq = ESPI_BASE_INTID; irq < ESPI_MAX_INTID; irq++ )
+        balance_irq(irq, from, up);

I think here we have an inefficient iteration over the ESPI range.
Even when CONFIG_GICV3_ESPI=y, the GIC hardware might not support ESPIs at runtime.

We should probably check whether they are present before entering the loop, so we do not waste cycles on 1024 unnecessary NULL lookups. Could we use something like ESPI_BASE_INTID + gic_number_espis() to set the loop boundary? What do you think?
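Something along these lines, perhaps (untested sketch; it assumes a gic_number_espis() helper that returns the HW-probed ESPI count, which would need to be introduced or may exist under a different name):

```diff
 #ifdef CONFIG_GICV3_ESPI
-    for ( irq = ESPI_BASE_INTID; irq < ESPI_MAX_INTID; irq++ )
+    for ( irq = ESPI_BASE_INTID;
+          irq < ESPI_BASE_INTID + gic_number_espis(); irq++ )
         balance_irq(irq, from, up);
 #endif
```

That way a GIC without ESPI support makes the loop body run zero times instead of walking the whole range.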

+#endif
+}
+#endif /* CONFIG_CPU_HOTPLUG */
+
  static int cpu_callback(struct notifier_block *nfb, unsigned long action,
                          void *hcpu)
  {
@@ -172,6 +226,11 @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action,
              printk(XENLOG_ERR "Unable to allocate local IRQ for CPU%u\n",
                     cpu);
          break;
+    case CPU_ONLINE:
+        if ( IS_ENABLED(CONFIG_CPU_HOTPLUG) &&
+             system_state >= SYS_STATE_active )
+            rebalance_irqs(cpu, true);
+        break;
      }

      return notifier_from_errno(rc);
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 7f3cfa812e..7d877179c0 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -425,6 +425,13 @@ void __cpu_disable(void)
      smp_mb();

+    /*
+     * Now that interrupts are cleared and the CPU is marked as offline,
+     * move interrupts away from it.
+     */
+    if ( IS_ENABLED(CONFIG_CPU_HOTPLUG) )
+        rebalance_irqs(cpu, false);
+
      /* Return to caller; eventually the IPI mechanism will unwind and the
       * scheduler will drop to the idle loop, which will call stop_cpu(). */
  }