
Re: [Xen-devel] [PATCH v3] xen/arm: fix rank/vgic lock inversion bug

Hi Stefano,

CC Andre to get more feedback.

On 31/01/2017 23:49, Stefano Stabellini wrote:
On Fri, 27 Jan 2017, Julien Grall wrote:
On 03/01/17 23:29, Stefano Stabellini wrote:
Always set the new physical irq affinity at the beginning of
vgic_migrate_irq, in all cases.

No need to set physical irq affinity in gic_update_one_lr anymore,
solving the lock inversion problem.

After migrating an interrupt from vcpu/pcpu 0 to vcpu/pcpu 1, it is
possible to receive a physical interrupt on pcpu 1, which Xen is
supposed to inject into vcpu 1, before the LR on pcpu 0 has been
cleared. In this case the irq is still marked as
GIC_IRQ_GUEST_MIGRATING, and struct pending_irq is still "inflight" on
vcpu 0. As the irq cannot be "inflight" on vcpu 0 and vcpu 1
simultaneously, drop the interrupt.

I am not sure I understand how this is related to the fix of the rank/vgic
lock inversion bug. Am I right to think that this bug was already present?

Yes, it is related: without this patch, this situation cannot happen,
because irq_set_affinity is called from gic_update_one_lr.

I am not convinced about that. gic_update_one_lr will take the lock of the current vCPU whilst vgic_vcpu_inject_irq is taking the lock of the vCPU where the vIRQ has been routed.

The memory access of the list_del_init could potentially be re-ordered with irq_set_affinity.

However, what is saving us is the memory barrier hidden in spin_lock call in gicv{2,3}_irq_set_affinity and the other one in vgic_vcpu_inject_irq.

So I agree this is currently working, but I would argue it is by luck.


-        vgic_vcpu_inject_irq(new, irq);

The comment on top of the if says: "If the IRQ is still lr_pending, re-inject
it to the new vCPU". However, you remove the interrupt from the list but never
re-inject it.

Looking at the context, I think we should keep the call to
vgic_vcpu_inject_irq, otherwise we lose the interrupt forever.

It looks like I made a mistake while generating the patch: I meant to
remove the redundant irq_set_affinity call, instead of the necessary
vgic_vcpu_inject_irq call. I don't know how it happened, sorry. Thanks
for catching the error.

     /* if the IRQ is in a GICH_LR register, set GIC_IRQ_GUEST_MIGRATING
@@ -495,6 +491,13 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int

+    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &n->status) )
+    {
+        /* Drop the interrupt, because it is still inflight on another vcpu */
+        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);

If I understand correctly, the problem arises because LRs are cleared lazily
and the other vCPU is still running. It is a short window and I don't think it
should happen often.

I would suggest to kick the other vCPU with an SGI to get the LRs cleared.

I am fine with sending an SGI, but we still need
http://marc.info/?l=xen-devel&m=148237295020488 to know the destination
of the SGI.

Hmmm correct.

The problem is what happens next: is it safe to busy-wait until the LR
is cleared?

Technically we already have this problem as LRs are cleared with the vgic lock taken.

So if by mistake you receive the interrupt on the wrong pCPU, you may busy-wait on the spinlock for some time.

Note, I am not saying we should do the same here :). I would argue it depends on how much time you expect clearing the LRs to take.

It is safe only if we cannot receive a second interrupt notification while
the first one has not been deactivated yet by the guest:

- guest vcpu0 reads GICC_IAR for virq0
- guest vcpu1 changes the target for virq0 to vcpu1
- Xen traps the write and changes the physical affinity of virq0 (pirq0)
  to vcpu1 (pcpu1)
- Xen receives a second virq0 (pirq0) interrupt notification in pcpu1,
  but virq0 is still MIGRATING and guest vcpu0 has not EOIed it yet
- Xen sends an SGI to vcpu0 (pcpu0), but the interrupt has not been
  EOIed yet, thus, the LR doesn't get cleared

Can this happen? I read 3.2.3 and 3.2.4 a few times but I am still
unsure, it looks like it can. If it can happen, we cannot busy-wait.

With 3.2.3 and 3.2.4, you are referring to the spec IHI 0048B, right?

I will leave aside PPIs and SGIs as they target one specific CPU and migration will never happen.

So we have to worry about SPIs and LPIs (though they are not yet supported). From the note in 3.2.3: "For any processor, if an interrupt is active and pending, the GIC does not signal an interrupt exception request for the interrupt to any processor until the active status is cleared."

Given that SPIs have an active state and will be shared, this scenario cannot happen.

For LPIs, there is no active state, so as soon as they are EOIed they might come up again. Depending on how we will handle IRQ migration, your scenario could become true. I am not sure we should take LPIs into account right now.

To be honest, I don't much like the idea of kicking the other vCPU. But I don't have a better idea in order to clear the LRs. I have CCed Andre who has more knowledge on the GIC than me. He may have a clever idea here :).


Julien Grall
