
Re: [Xen-devel] [PATCH 0/7] xen/arm: CPU hotplug fixes


On 13/04/18 11:19, Mirela Simonovic wrote:
On Thu, Apr 12, 2018 at 10:43 AM, Julien Grall <julien.grall@xxxxxxx> wrote:

On 11/04/18 17:37, Mirela Simonovic wrote:

Hi Julien,


May I ask you to configure your mail client to use > for quoting and to use
plain text? Otherwise, it is going to be really difficult to follow the
discussion after a few rounds (see already below).

On Wed, Apr 11, 2018 at 6:02 PM, Julien Grall <julien.grall@xxxxxxx> wrote:


     On 11/04/18 16:58, Mirela Simonovic wrote:

         On 04/11/2018 05:07 PM, Julien Grall wrote:

             On 11/04/18 14:19, Mirela Simonovic wrote:

         Migrating interrupts when turning off a CPU already works.
         However, when a CPU is turned back on there is no interrupt
         migration back to the hotplugged CPU - all interrupts will
         remain routed to CPU#0.
         Patch 7/7 fixes this.

     What do you mean by all interrupts? Interrupts routed to a guest will
     always follow the vCPU. So are you sure they are going to be
     migrated when that vCPU is paused/off?

Just to make sure we're on the same page - this is about hotplugging
physical CPUs. Hotplugging vCPUs using the virtual PSCI CPU_OFF interface is
already implemented and unrelated to this series.

Yes, we are on the same page :). I was just wondering what happens to
interrupts routed to that pCPU.

Assuming that the system has 2 pCPUs, by 'all interrupts' I mean interrupts
that were targeted to pCPU#0 and pCPU#1 prior to doing any hotplug.

For example, if a guest is pinned to pCPU#1, an interrupt of a device it
owns will be targeted to pCPU#1.
When pCPU#1 is turned off, that interrupt will be migrated to pCPU#0.
pCPU#0 finalizes the suspend and receives wake-up interrupts. However, when
pCPU#1 is turned back on, that interrupt will remain targeted to pCPU#0,
which I assumed is wrong.
The scenario described here is also how I tested this.
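
The scenario above can be restated as a toy model. This is only a sketch and not Xen code: the array `irq_target`, the helpers `route_irq`, `toy_cpu_down` and `toy_cpu_up`, and the sizes `NR_CPUS`/`NR_IRQS` are all hypothetical names made up for illustration. It assumes, as described in this thread, that interrupts targeting a pCPU are moved to pCPU#0 when that pCPU goes down and that nothing moves them back when it comes up.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2
#define NR_IRQS 4

/* Toy model, not Xen code: irq_target[i] is the pCPU that IRQ i is routed to. */
static unsigned int irq_target[NR_IRQS];
static bool cpu_online[NR_CPUS] = { true, true };

/* Route an IRQ to an online pCPU (e.g. the one its guest is pinned to). */
static void route_irq(unsigned int irq, unsigned int cpu)
{
    assert(irq < NR_IRQS && cpu < NR_CPUS && cpu_online[cpu]);
    irq_target[irq] = cpu;
}

/* Turning a secondary pCPU off: every IRQ targeting it is moved to pCPU#0. */
static void toy_cpu_down(unsigned int cpu)
{
    assert(cpu != 0 && cpu < NR_CPUS);
    cpu_online[cpu] = false;
    for (unsigned int irq = 0; irq < NR_IRQS; irq++)
        if (irq_target[irq] == cpu)
            irq_target[irq] = 0;
}

/* Turning it back on: the pCPU is online again, but no IRQ is moved back. */
static void toy_cpu_up(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    cpu_online[cpu] = true;
}
```

In this model, routing IRQ 3 to pCPU#1, then calling `toy_cpu_down(1)` followed by `toy_cpu_up(1)`, leaves IRQ 3 targeted at pCPU#0, which is the behaviour being questioned.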

     Can you give the path in Xen doing that?

Sure, here is a backtrace (dumped on the CPU being turned off):
      0  0x2603dc arch_move_irqs(): vgic.c, line 309
      1  0x22ee58 sched_move_irqs()+20: schedule.c, line 303
      2  0x2318e8 cpu_disable_scheduler()+1000: schedule.c, line 586
      3  0x2318e8 cpu_disable_scheduler()+1000: schedule.c, line 586
      4  0x25aff8 __cpu_disable()+96: smpboot.c, line 386
      5  0x201608 take_cpu_down()+52: cpu.c, line 75
      6  0x23426c stopmachine_action()+188: stop_machine.c, line 159
      7  0x235858 do_tasklet_work()+176: tasklet.c, line 94
      8  0x235c80 do_tasklet()+104: tasklet.c, line 126
      9  0x24daec idle_loop()+144: domain.c, line 72
     10  0x25b1f8 start_secondary()+404: smpboot.c, line 368

So this covers interrupts routed to a virtual CPU. However, this does not
handle interrupts used by Xen. How do you handle them?

For instance, SMMU IRQs might be routed to a CPU other than CPU#0.

Interrupts used by Xen should not wake up the system and will be
disabled when we suspend the devices used by Xen.

Here you only speak about the suspend use case. While I understand your ultimate goal is suspend/resume, this series is about CPU hotplug.

IMHO, the suspend/resume case is no more than a superset of CPU up/down. If you solve the problem for up/down, you are likely going to solve it for suspend/resume.

So, what would happen to interrupts routed to the CPU going offline?

However, I need to double check that such interrupts get enabled on
the right CPU on resume. Could you please tell me which mechanism in
Xen is used to target such an interrupt to a secondary CPU only? Is
that even possible and why would that be used?

SPIs will be routed to the CPU calling setup_irq, which may not always be CPU#0. For instance, this is the case for the SMMU context interrupts, because they are set up when the device is assigned.

I guess this decision is arguable. If you move all the interrupts to CPU#0, it will potentially disrupt the vCPU running on it. I am thinking of the case of an SMMU fault, which could easily be triggered by another domain.


Julien Grall
