Re: [Xen-devel] [PATCH] xen/arm: fix rank/vgic locks inversion bug
On Fri, 16 Dec 2016, Julien Grall wrote:
> Hi Stefano,
>
> On 16/12/16 00:08, Stefano Stabellini wrote:
> > On Thu, 15 Dec 2016, Julien Grall wrote:
> > > On 15/12/2016 01:04, Stefano Stabellini wrote:
> > > > The locking order is: first rank lock, then vgic lock. The order
> > > > is respected everywhere, except for gic_update_one_lr.
> > > >
> > > > gic_update_one_lr is called with the vgic lock held, but it calls
> > > > vgic_get_target_vcpu, which tries to obtain the rank lock. This
> > > > can cause deadlocks.
> > > >
> > > > We already have a version of vgic_get_target_vcpu that doesn't
> > > > take the rank lock: __vgic_get_target_vcpu. Also, the only
> > > > routines that modify the target vcpu are vgic_store_itargetsr and
> > > > vgic_store_irouter. They call vgic_migrate_irq, which already
> > > > takes the vgic lock.
> > > >
> > > > Solve the lock inversion problem by not taking the rank lock in
> > > > gic_update_one_lr (calling __vgic_get_target_vcpu instead of
> > > > vgic_get_target_vcpu).
> > >
> > > If I look at the callers of gic_update_one_lr, the function
> > > gic_clear_lrs will not take the rank lock. So from my understanding
> > > nobody will take the rank lock here.
> > >
> > > However, __vgic_get_target_vcpu has an ASSERT to check whether the
> > > rank lock has been taken. So who is taking the lock for
> > > gic_update_one_lr now?
> >
> > I should have removed the ASSERT - nobody is taking the rank lock now
> > on the gic_update_one_lr code path.
>
> Please either keep this ASSERT or update it, but do not drop it: we
> need to prevent anyone from calling this function without the lock
> taken, at least in debug builds.
>
> The current callers (see vgic_{disable,enable}_irqs) call the function
> with the rank lock taken, so we need to keep this behavior at least
> for them.
>
> > > > We make this safe by placing modifications to rank->vcpu within
> > > > regions protected by the vgic lock.
> > >
> > > This looks unsafe to me.
> > > The vgic lock is per-vcpu, but rank->vcpu could be written/read by
> > > any vCPU. So you will never protect rank->vcpu with this lock.
> > > Did I miss anything?
> >
> > The vgic lock is per-vcpu, but it is always the vgic lock of the old
> > vcpu (old in the sense: before the interrupt is migrated) that is
> > taken.
> >
> > On one hand, any vcpu can read/write to itargetsr. Those operations
> > are protected by the rank lock. However, vgic_migrate_irq and writes
> > to rank->vcpu are also protected by the vgic lock of the old vcpu
> > (first the rank lock is taken, then the vgic lock of the old vcpu).
>
> I don't think this is true. Any vCPU's read/write to itargetsr will be
> protected by the vgic lock of the vCPU in rank->vcpu[offset].
>
> For an inflight IRQ, the migration will happen when the guest has
> EOIed the IRQ. Until that happens, the guest may have written multiple
> times to ITARGETSR.
>
> For instance, let's say the guest requests to migrate the IRQ from
> vCPU A to vCPU B. The vgic lock of A will be taken in
> vgic_store_itargetsr; if the interrupt is inflight, vgic_migrate_irq
> will set a flag. Now the guest could decide to migrate the interrupt
> to vCPU C before the interrupt has been EOIed. In this case, the vgic
> lock of B will be taken.
>
> So we have a mismatch between the lock taken in gic_update_one_lr and
> the one taken in vgic_migrate_irq.
>
> I think the only safe way to fully protect rank->vcpu is taking the
> rank lock. It will never change and is accessible from anywhere.

When I submitted this series, I considered this scenario, and I thought
that the existing GIC_IRQ_GUEST_MIGRATING flag should be enough to
protect it, because when GIC_IRQ_GUEST_MIGRATING is set,
vgic_migrate_irq actually returns immediately, without doing anything.
Either GIC_IRQ_GUEST_MIGRATING is not set, and it's the first irq
migration, or GIC_IRQ_GUEST_MIGRATING is set, and vgic_migrate_irq just
returns.
However, as I was writing this explanation to you in more detail, I
realized that it is not actually safe. Concurrent accesses to
rank->vcpu, and concurrent calls to irq_set_affinity, can result in the
physical affinity actually being set to the intermediate pcpu rather
than the final one. As a matter of fact, one of these races affects the
code today, even without this patch series.

I considered various possible ways to solve this problem and the lock
inversion problem at once. These are the potential solutions I came up
with:

1) We release the vgic lock at the end of gic_update_one_lr, before
   test_and_clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status). Then we
   take the rank lock. We don't need the per-vcpu vgic lock to execute
   the following code safely, the rank lock is enough:

    if ( test_and_clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
    {
        struct vcpu *v_target = __vgic_get_target_vcpu(v, irq);
        irq_set_affinity(p->desc, cpumask_of(v_target->processor));
    }

   This sounds simple in theory, but in practice it is very ugly. We
   would end up with a function (gic_update_one_lr) that is called with
   the vgic lock held but actually releases it before returning.

2) We run gic_update_one_lr and vgic_store_itargetsr in parallel safely
   and locklessly. There might be a way to do it, but it is not easy
   and I haven't found it yet.

3) Use the vgic lock of the old vcpu to protect read accesses to
   rank->vcpu. This is the same strategy as implemented in this series.
   However, to actually work correctly, we also have to store the
   original vcpu id in struct pending_irq. That way, we know for sure
   which vgic lock we need to take.

Let me premise that I don't like any of these approaches very much, but
I think that 1) is the least bad. If that's OK for you, I'll do that.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel