
Re: [Xen-devel] [RFC PATCH] dpci: Put the dpci back on the list if running on another CPU.



>>> On 02.02.15 at 15:29, <konrad.wilk@xxxxxxxxxx> wrote:
> On Tue, Jan 13, 2015 at 10:20:00AM +0000, Jan Beulich wrote:
>> >>> On 12.01.15 at 17:45, <konrad.wilk@xxxxxxxxxx> wrote:
>> > There is a race when we clear STATE_SCHED in the softirq
>> > - which allows 'raise_softirq_for' to progress and
>> > schedule another dpci. During that time another CPU could
>> > receive an interrupt, call 'raise_softirq_for', and put
>> > the dpci on its per-cpu list. There would then be two
>> > 'dpci_softirq' instances running at the same time (on different
>> > CPUs) while the dpci state is STATE_RUN (and STATE_SCHED is
>> > cleared). This ends up hitting:
>> > 
>> >  if ( test_and_set_bit(STATE_RUN, &pirq_dpci->state) )
>> >    BUG()
>> > 
>> > Instead of that, put the dpci back on the per-cpu list to be
>> > dealt with later.
>> > 
>> > The reason we can hit this is when an interrupt's
>> > affinity is set over multiple CPUs.
>> > 
>> > Another potential fix would be to add a guard in raise_softirq_for
>> > to check for the 'STATE_RUN' bit being set and not schedule the
>> > dpci until that bit has been cleared.
>> 
>> I indeed think this should be investigated, because it would make
>> explicit what ...
>> 
>> > --- a/xen/drivers/passthrough/io.c
>> > +++ b/xen/drivers/passthrough/io.c
>> > @@ -804,7 +804,17 @@ static void dpci_softirq(void)
>> >          d = pirq_dpci->dom;
>> >          smp_mb(); /* 'd' MUST be saved before we set/clear the bits. */
>> >          if ( test_and_set_bit(STATE_RUN, &pirq_dpci->state) )
>> > -            BUG();
>> > +        {
>> > +            unsigned long flags;
>> > +
>> > +            /* Put back on the list and retry. */
>> > +            local_irq_save(flags);
>> > +            list_add_tail(&pirq_dpci->softirq_list, &this_cpu(dpci_list));
>> > +            local_irq_restore(flags);
>> > +
>> > +            raise_softirq(HVM_DPCI_SOFTIRQ);
>> > +            continue;
>> > +        }
>> 
>> ... this does implicitly - spin until the bad condition cleared.
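
For reference, the interleaving the requeue-and-retry papers over looks
roughly like this (a sketch based on the ordering in dpci_softirq() and
raise_softirq_for(), not taken verbatim from the code):

    CPU0: dpci_softirq()                CPU1: device interrupt
      test_and_set_bit(STATE_RUN)
      test_and_clear_bit(STATE_SCHED)
      hvm_dirq_assist() still running
                                        raise_softirq_for():
                                          STATE_SCHED was clear -> set it,
                                          queue dpci on CPU1's list
                                        dpci_softirq():
                                          test_and_set_bit(STATE_RUN)
                                          already set -> BUG() (or, with
                                          the patch above, requeue and
                                          retry)
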
> 
> I still haven't gotten around to trying to reproduce the issues Malcolm
> saw, which should be easier for me to do (using Intel super-fast storage
> in a Windows guest) since I have one of those boxes.
> 
> However, in lieu of that, here is a patch that does pass my SR-IOV
> testing and that I believe should work fine.  I _think_ it covers
> what you wanted it to, Jan, but of course please correct me
> if I made a mistake in the logic.

Since the code quoted above is still there in the new patch, the new
patch can at best be half of what I suggested.

> --- a/xen/drivers/passthrough/io.c
> +++ b/xen/drivers/passthrough/io.c
> @@ -63,10 +63,37 @@ enum {
>  static void raise_softirq_for(struct hvm_pirq_dpci *pirq_dpci)
>  {
>      unsigned long flags;
> +    unsigned long old, new, val = pirq_dpci->state;
>  
> -    if ( test_and_set_bit(STATE_SCHED, &pirq_dpci->state) )
> -        return;
> +    /*
> +     * This cmpxchg is a clearer version of:
> +     * if ( test_and_set_bit(STATE_SCHED, &pirq_dpci->state) )
> +     *         return;
> +     * since it also checks for STATE_RUN conditions.
> +     */
> +    for ( ;; )
> +    {
> +        new = 1 << STATE_SCHED;
> +        if ( val )
> +            new |= val;

Why the if()? OR-ing a zero 'val' into 'new' is a no-op, so the conditional gains nothing.

>  
> +        old = cmpxchg(&pirq_dpci->state, val, new);
> +        switch ( old )
> +        {
> +        case (1 << STATE_SCHED):
> +        case (1 << STATE_RUN) | (1 << STATE_SCHED):
> +            /*
> +             * Whenever STATE_SCHED is set we MUST not schedule it.
> +             */
> +            return;
> +        }
> +        /*
> +         * If the 'state' is 0 or (1 << STATE_RUN) we can schedule it.
> +         */

Really? Wasn't the intention to _not_ schedule when STATE_RUN is
set? (Also the above is a single-line comment, i.e. wants different style.)

I.e. with what you do now you could just as well keep the old code.
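
If the intention really is to not schedule while STATE_RUN is set, the
loop would have to treat a RUN-only state as "wait" rather than "go
ahead". Roughly (just a sketch of that idea, not a tested change - the
tail of the function, queueing onto the per-cpu list and raising
HVM_DPCI_SOFTIRQ, would stay as it is today, and spinning in interrupt
context like this has questions of its own, e.g. when the dpci is being
run on this very CPU):

    static void raise_softirq_for(struct hvm_pirq_dpci *pirq_dpci)
    {
        unsigned long old;

        for ( ;; )
        {
            /* Only the 0 -> STATE_SCHED transition may queue the dpci. */
            old = cmpxchg(&pirq_dpci->state, 0, 1 << STATE_SCHED);
            if ( old == 0 )
                break;                  /* we now own STATE_SCHED */
            if ( old & (1 << STATE_SCHED) )
                return;                 /* already queued elsewhere */
            /* Only STATE_RUN is set - wait for it to clear. */
            cpu_relax();
        }

        /* ... queue on the per-cpu list and raise the softirq ... */
    }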

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

