[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [RFC PATCH] dpci: Put the dpci back on the list if running on another CPU.
On Mon, Jan 12, 2015 at 11:45:30AM -0500, Konrad Rzeszutek Wilk wrote: > There is race when we clear the STATE_SCHED in the softirq > - which allows the 'raise_softirq_for' to progress and > schedule another dpci. During that time the other CPU could > receive an interrupt and calls 'raise_softirq_for' and put > the dpci on its per-cpu list. There would be two 'dpci_softirq' > running at the same time (on different CPUs) where the > dpci state is STATE_RUN (and STATE_SCHED is cleared). This > ends up hitting: > > if ( test_and_set_bit(STATE_RUN, &pirq_dpci->state) ) > BUG() > > Instead of that put the dpci back on the per-cpu list to deal > with later. > > The reason we can get his with this is when an interrupt > affinity is set over multiple CPUs. > > Another potential fix would be to add a guard in the raise_softirq_for > to check for 'STATE_RUN' bit being set and not schedule the > dpci until that bit has been cleared. > > Reported-by: Sander Eikelenboom <linux@xxxxxxxxxxxxxx> > Reported-by: Malcolm Crossley <malcolm.crossley@xxxxxxxxxx> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> > --- > xen/drivers/passthrough/io.c | 12 +++++++++++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c > index ae050df..9b77334 100644 > --- a/xen/drivers/passthrough/io.c > +++ b/xen/drivers/passthrough/io.c > @@ -804,7 +804,17 @@ static void dpci_softirq(void) > d = pirq_dpci->dom; > smp_mb(); /* 'd' MUST be saved before we set/clear the bits. */ > if ( test_and_set_bit(STATE_RUN, &pirq_dpci->state) ) > - BUG(); > + { > + unsigned long flags; > + > + /* Put back on the list and retry. */ > + local_irq_save(flags); > + list_add_tail(&pirq_dpci->softirq_list, &this_cpu(dpci_list)); I chatted with Jan on IRC this, and one worry is that if we add on our per-cpu list an 'dpci' that is running on another CPU - if the other CPU runs 'list_del' it will go BOOM. However, I am not sure if I can come up with a scenario in which this will be triggered - as by the time we get to checking the STATE_RUN condition the list removal has already been done. So adding in on the per-cpu list is OK - and since the STATE_SCHED is set, it guards against double-list addition. CPU1 CPU2 CPU3: softirq_dpci: list_del() from per-cpu-list test-and-set(STATE_RUN) test-and-clear(STATE_SCHED) .. raise_softirq test-and-set(STATE_SCHED) softirq_dpci list_del from per-cpu-list if-test-and-set(STATE_RUN) fails: list_add_tail(..) continue raise_softirq test-and-set(STATE_SCHED) [fails, so no adding] .. softirq_dpci list_del if-test-and-set(STATE_RUN) fails: list_add_tail(..) [OK, we did the list_del] continue.. clear_bit(STATE_RUN) softrqi_dpci list_del() test-and-set(STATE_RUN) test-and-clear(STATE_SCHED) > + local_irq_restore(flags); > + > + raise_softirq(HVM_DPCI_SOFTIRQ); > + continue; > + } > /* > * The one who clears STATE_SCHED MUST refcount the domain. > */ > -- > 2.1.0 > _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |