[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v5 3/3] xen/arm: vgic_migrate_irq: do not race against GIC_IRQ_GUEST_MIGRATING



On Fri, 3 Mar 2017, Julien Grall wrote:
> Hi Stefano,
> 
> On 01/03/17 22:15, Stefano Stabellini wrote:
> > A potential race condition occurs when vgic_migrate_irq is called a
> > second time, while GIC_IRQ_GUEST_MIGRATING is already set. In that case,
> > vgic_migrate_irq takes a different vgic lock from gic_update_one_lr.
> 
> Hmmm, vgic_migrate_irq will bail out before accessing inflight list if
> GIC_IRQ_GUEST_MIGRATING is already set:
> 
> /* migrating already in progress, no need to do anything */
> if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status )
>   return;
> 
> And test_bit is atomic. So I don't understand what is the corruption problem
> you mention.

The scenario is a bit convoluted: GIC_IRQ_GUEST_MIGRATING is already set
and vgic_migrate_irq is called to move the irq again, even though the
first migration is not complete yet. This could happen:


      CPU 0                                    CPU 1
  gic_update_one_lr
  test_and_clear_bit MIGRATING
  read target (old)
                                            write target (new)
                                            vgic_migrate_irq
                                              test_bit MIGRATING
                                              irq_set_affinity (new)
                                              return
  irq_set_affinity (old)


After this patch this would happen:

      CPU 0                                    CPU 1
  gic_update_one_lr
  test_bit MIGRATING
  read target (old)
                                            write target (new)
                                            vgic_migrate_irq
                                              test MIGRATING && 
GIC_IRQ_GUEST_VISIBLE (false)
                                              wait until !MIGRATING
  irq_set_affinity (old)
  clear_bit MIGRATING
                                              irq_set_affinity (new)


> > vgic_migrate_irq running concurrently with gic_update_one_lr could cause
> > data corruptions, as they both access the inflight list.
> > 
> > This patch fixes this problem. In vgic_migrate_irq after setting the new
> > vcpu target, it checks both GIC_IRQ_GUEST_MIGRATING and
> > GIC_IRQ_GUEST_VISIBLE. If they are both set we can just return because
> > we have already set the new target: when gic_update_one_lr reaches
> > the GIC_IRQ_GUEST_MIGRATING test, it will do the right thing.
> > 
> > Otherwise, if GIC_IRQ_GUEST_MIGRATING is set but GIC_IRQ_GUEST_VISIBLE
> > is not, gic_update_one_lr is running at the very same time on another
> > pcpu, so it just waits until it completes (GIC_IRQ_GUEST_MIGRATING is
> > cleared).
> > 
> > Signed-off-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> > ---
> >  xen/arch/arm/gic.c  |  5 ++++-
> >  xen/arch/arm/vgic.c | 16 ++++++++++++++--
> >  2 files changed, 18 insertions(+), 3 deletions(-)
> > 
> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> > index 16bb150..a805300 100644
> > --- a/xen/arch/arm/gic.c
> > +++ b/xen/arch/arm/gic.c
> > @@ -508,10 +508,13 @@ static void gic_update_one_lr(struct vcpu *v, int i)
> >               * next pcpu, inflight is already cleared. No concurrent
> >               * accesses to inflight. */
> >              smp_mb();
> > -            if ( test_and_clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
> > +            if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
> >              {
> >                  struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
> >                  irq_set_affinity(p->desc, cpumask_of(v_target->processor));
> > +                /* Set the new affinity, then clear MIGRATING. */
> > +                smp_mb();
> > +                clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status);
> >              }
> >          }
> >      }
> > diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> > index a323e7e..9141b34 100644
> > --- a/xen/arch/arm/vgic.c
> > +++ b/xen/arch/arm/vgic.c
> > @@ -252,13 +252,25 @@ void vgic_migrate_irq(struct vcpu *old, struct vcpu
> > *new,
> >      spin_lock_irqsave(&old->arch.vgic.lock, flags);
> >      write_atomic(t_vcpu, new->vcpu_id);
> > 
> > -    /* migration already in progress, no need to do anything */
> > -    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
> > +    /* Set the new target, then check MIGRATING and VISIBLE, it pairs
> > +     * with the barrier in gic_update_one_lr. */
> > +    smp_mb();
> > +
> > +    /* no need to do anything, gic_update_one_lr will take care of it */
> > +    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) &&
> > +         test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status) )
> >      {
> >          spin_unlock_irqrestore(&old->arch.vgic.lock, flags);
> >          return;
> >      }
> > 
> > +    /* gic_update_one_lr is currently running, wait until its completion */
> > +    while ( test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
> > +    {
> > +        cpu_relax();
> > +        smp_rmb();
> > +    }
> > +
> >      if ( list_empty(&p->inflight) )
> >      {
> >          irq_set_affinity(p->desc, cpumask_of(new->processor));
> > 
> 
> Cheers,
> 
> -- 
> Julien Grall
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.