[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] dpci: Put the dpci back on the list if scheduled from another CPU.



On Wed, Mar 18, 2015 at 07:38:12AM +0000, Jan Beulich wrote:
> >>> On 17.03.15 at 18:15, <konrad.wilk@xxxxxxxxxx> wrote:
> > On Tue, Mar 17, 2015 at 04:06:14PM +0000, Jan Beulich wrote:
> >> >>> On 17.03.15 at 16:38, <konrad.wilk@xxxxxxxxxx> wrote:
> >> > --- a/xen/drivers/passthrough/io.c
> >> > +++ b/xen/drivers/passthrough/io.c
> >> > @@ -804,7 +804,17 @@ static void dpci_softirq(void)
> >> >          d = pirq_dpci->dom;
> >> >          smp_mb(); /* 'd' MUST be saved before we set/clear the bits. */
> >> >          if ( test_and_set_bit(STATE_RUN, &pirq_dpci->state) )
> >> > -            BUG();
> >> > +        {
> >> > +            unsigned long flags;
> >> > +
> >> > +            /* Put back on the list and retry. */
> >> > +            local_irq_save(flags);
> >> > +            list_add_tail(&pirq_dpci->softirq_list, 
> >> > &this_cpu(dpci_list));
> >> > +            local_irq_restore(flags);
> >> > +
> >> > +            raise_softirq(HVM_DPCI_SOFTIRQ);
> >> > +            continue;
> >> > +        }
> >> 
> >> As just said in another mail - unless there are convincing new
> >> arguments in favor of this (more of a hack than a real fix), I'm
> >> not going to accept it and instead consider reverting the
> >> offending commit. Iirc the latest we had come to looked quite a
> >> bit better than this one.
> > 
> > The latest one (please see attached) would cause an dead-lock iff
> > on the CPU we are running the softirq and an do_IRQ comes for the
> > exact dpci we are in process of executing.
> 
> I'm afraid I'm not seeing it - please explain.

Lets assume that the device is an PCIe with MSI. We have only one
VCPU in the guest.

We receive the first interrupt, end up going:
vmx_vmexit_handler
 - case EXIT_REASON_EXTERNAL_INTERRUPT
   \- vmx_do_extint
        \- do_IRQ
          \- __do_IRQ_guest
              \- hvm_do_IRQ_dpci
                 \- raise_softirq_for
                        [DPCI_SOFTIRQ bit set]
vmx_process_softirq
 sti
 do_softirq
   -\ __do_sofitq_
        \- dpci_softirq
            -\ hvm_dirq_assist
               [state is 'running']

[Same vector comes in]
do_IRQ
 \- __do_IRQ_guest
    \- hvm_do_IRQ_dpci
       \- raise_softirq_for
           [here we either
            a) spin waiting 'running' to be done or -- dead-lock
            b) we just exit out and drop this interrupt, 
            c) increment 'masked' to tell 'dpci_softirq' to reschedule
               -- live lock if this keeps on going]

Now c) is protected - the do_IRQ has anti-storm code so eventually
it will stop.
> 
> As to the code - I think switch() is rather hiding the intentions
> here, i.e. the code would be better readable if using two if()s:
> 
> +    for ( ;; )
> +    {
> +        old = cmpxchg(&pirq_dpci->state, 0, 1 << STATE_SCHED);
> +        /* If the 'state' is 0 (not in use) we can schedule it. */
> +        if ( !old )
> +            break;
> +        if ( !(old & (1 << STATE_RUN)) )
> +        {
> +            /* Whenever STATE_SCHED is set we MUST not schedule it. */
> +            assert(old & (1 << STATE_SCHED));
> +            return;
> +        }
> +    }
> 
> Jan
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.