Re: [Xen-devel] [PATCH v8 for-xen-4.5 2/2] dpci: Replace tasklet with an softirq (v8)
On Fri, Oct 24, 2014 at 04:55:44PM -0400, Konrad Rzeszutek Wilk wrote:
> On Fri, Oct 24, 2014 at 11:09:59AM +0100, Jan Beulich wrote:
> > >>> On 24.10.14 at 03:58, <konrad.wilk@xxxxxxxxxx> wrote:
> > > On Thu, Oct 23, 2014 at 10:36:07AM +0100, Jan Beulich wrote:
> > >> >>> On 21.10.14 at 19:19, <konrad.wilk@xxxxxxxxxx> wrote:
> > > Was not sure whether you prefer 'true' or 'false' instead of
> > > numbers - decided on numbers since most of the code-base uses numbers.
> > 
> > So far we don't use "true" and "false" in hypervisor code at all (or if
> > you spotted any such use, it slipped in by mistake). We ought to
> > consider switching to bool/true/false though.
> 
> The dec_lzma2 code had it.
> 
> > > +/*
> > > + * Reset the pirq_dpci->dom parameter to NULL.
> > > + *
> > > + * This function checks the different states to make sure it can do
> > > + * so at the right time, and if it unschedules the softirq before it
> > > + * has run it also drops the refcount (which is what the softirq
> > > + * would have done).
> > > + */
> > > +static void pt_pirq_softirq_reset(struct hvm_pirq_dpci *pirq_dpci)
> > > +{
> > > +    struct domain *d = pirq_dpci->dom;
> > > +
> > > +    ASSERT(spin_is_locked(&d->event_lock));
> > > +    switch ( cmpxchg(&pirq_dpci->state, STATE_SCHED, 0) )
> > > +    {
> > > +        /*
> > > +         * We are going to try to de-schedule the softirq before it
> > > +         * goes in STATE_RUN. Whoever clears STATE_SCHED MUST refcount
> > > +         * the 'dom'.
> > > +         */
> > > +        case STATE_SCHED:
> > > +            put_domain(d);
> > > +            /* fallthrough. */
> > > +        /*
> > > +         * The reason it is OK to reset 'dom' when the STATE_RUN bit is
> > > +         * set is due to a shortcut the 'dpci_softirq' implements. It
> > > +         * stashes the 'dom' in a local variable before it sets
> > > +         * STATE_RUN - and therefore will not dereference '->dom',
> > > +         * which would result in a crash.
> > > +         */
> > > +        case STATE_RUN:
> > > +            pirq_dpci->dom = NULL;
> > > +            break;
> > > +        default:
> > > +            break;
> > > +    }
> > 
> > Apart from the indentation being wrong (case labels having the same
> > indentation as the switch()'s opening brace), this doesn't seem to be a
> > proper equivalent of the former code: there you cleared ->dom when
> > STATE_RUN regardless of STATE_SCHED. Leaving out the comments,
> > I'd suggest
> > 
> >     switch ( cmpxchg(&pirq_dpci->state, STATE_SCHED, 0) )
> >     {
> >     case STATE_SCHED:
> >         put_domain(d);
> >     case STATE_RUN: case STATE_SCHED|STATE_RUN:
> 
> ... which made me realize that testing values (as opposed to bit
> positions) requires ditching the 'enum' and introducing STATE_SCHED_BIT,
> STATE_SCHED, STATE_RUN_BIT, and STATE_RUN to complement each other when
> checking values or setting bits.
> 
> >         pirq_dpci->dom = NULL;
> >         break;
> >     default:
> >         BUG();
> >     case 0:
> >         break;
> >     }
> > 
> > > --- a/xen/drivers/passthrough/pci.c
> > > +++ b/xen/drivers/passthrough/pci.c
> > > @@ -767,40 +767,51 @@ static int pci_clean_dpci_irq(struct domain *d,
> > >          xfree(digl);
> > >      }
> > >  
> > > -    tasklet_kill(&pirq_dpci->tasklet);
> > > -
> > > -    return 0;
> > > +    return pt_pirq_softirq_active(pirq_dpci);
> > 
> > This function returns a bool_t now, but the (indirect) caller of this
> > function expects -ERESTART.
> 
> Fixed up.
> 
> I added some extra code so that I could reliably trigger the error paths
> and got:
> 
> (XEN) pt_pirq_softirq_active: is 0 (debug: 1)
> 
> This is the first ever usage of pt_irq_create_bind, and for fun the
> code returns 'false':
> 
> (XEN) Assertion '!preempt_count()' failed at preempt.c:37
> (XEN) ----[ Xen-4.5.0-rc  x86_64  debug=y  Not tainted ]----
> (XEN) CPU:    9
> (XEN) RIP:    e008:[<ffff82d08011d6db>] ASSERT_NOT_IN_ATOMIC+0x22/0x67
> (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor
> (XEN) rax: ffff82d080320d20   rbx: ffff8302fb126470   rcx: 0000000000000001
> (XEN) rdx: 00000032b8d99300   rsi: 000000000000000a   rdi: ffff8303149670b8
> (XEN) rbp: ffff8303390a7bd8   rsp: ffff8303390a7bd8   r8:  0000000000000000
> (XEN) r9:  0000000000000000   r10: 00000000fffffffd   r11: ffff82d08023e5e0
> (XEN) r12: ffff83025c126700   r13: ffff830314967000   r14: 0000000000000030
> (XEN) r15: ffff83025c126728   cr0: 0000000080050033   cr4: 00000000000026f0
> (XEN) cr3: 000000031b4b7000   cr2: 0000000000000000
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
> (XEN) Xen stack trace from rsp=ffff8303390a7bd8:
> (XEN)    ffff8303390a7c98 ffff82d080149335 ffff8303390a7ca8 ffff82d08016d0e6
> (XEN)    0000000000000001 0000000000000206 00000003c0027c38 ffff8303390a7d88
> (XEN)    00000072390a7c68 ffff830312e5f4d0 0000000000000000 0000001000000003
> (XEN)    0000000000000246 ffff8303390a7c58 ffff82d08012ce66 ffff8303390a7e48
> (XEN)    0000007f390a7c88 ffff8303149670b8 fffffffffffffffd fffffffffffffffd
> (XEN)    00007f81af369004 ffff830314967000 ffff8303390a7e38 0000000000000008
> (XEN)    ffff8303390a7d78 ffff82d080160131 ffff8303390a7cd8 ffff830302526f10
> (XEN)    ffff8303390a7ce8 0000000000000073 ffff830330907324 ffff830330907300
> (XEN)    ffff8303390a7d08 ffff82d08016dc79 ffff8303390a7cf8 ffff82d08013b469
> (XEN)    ffff8303390a7d28 ffff82d08016e628 ffff8303390a7d28 ffff830314967000
> (XEN)    000000000000007f 0000000000000206 ffff8303390a7dc8 ffff82d0801711dc
> (XEN)    ffff8303390a7d88 ffff82d08011e203 0000000000000202 fffffffffffffffd
> (XEN)    00007f81af369004 ffff830314967000 ffff8800331eb488 0000000000000008
> (XEN)    ffff8303390a7ef8 ffff82d0801048e8 ffff830302526f10 ffff83025c126700
> (XEN)    ffff8303390a7dc8 ffff830314967000 00000000ffffffff 0000000000000000
> (XEN)    ffff8303390a7e98 ffff8303390a7e70 ffff8303390a7e48 ffff82d080184019
> (XEN)    ffff8303390a7f18 ffffffff8158b0de ffff8303390a7e98 ffff8303149670b8
> (XEN)    ffff83030000007f ffff82d080191105 000000730000f800 ffff8303390a7e74
> (XEN)    ffff8300bf52e000 000000000000000d 00007f81af369004 ffff8300bf52e000
> (XEN)    0000000a00000026 0000000000f70002 000000020000007f 00007f8100000002
> (XEN) Xen call trace:
> (XEN)    [<ffff82d08011d6db>] ASSERT_NOT_IN_ATOMIC+0x22/0x67
> (XEN)    [<ffff82d080149335>] pt_irq_create_bind+0xf7/0x5c2
> (XEN)    [<ffff82d080160131>] arch_do_domctl+0x1131/0x23e0
> (XEN)    [<ffff82d0801048e8>] do_domctl+0x1934/0x1a9c
> (XEN)    [<ffff82d08022c71b>] syscall_enter+0xeb/0x145
> 
> which is entirely due to holding the 'domctl_lock_acquire',
> 'rcu_read_lock', and 'pcidevs_lock' locks.
> 
> It seems that the approach of waiting by kicking 'process_pending_softirqs'
> is not good, as other softirqs might want to grab any of those locks
> at some point and be unhappy.

Ugh. The original code's (as is now) life-cycle was that the tasklet
would be initialized the moment the guest tried to set up an MSI (or
PCIe) interrupt. It would then be cleaned up (spinning forever waiting
for the tasklet to finish) when the domain was being destroyed. So it
all revolved around 'struct domain'.

With these patches we put this life-cycle around each specific PIRQ.
The life-cycle starts when the interrupt is set up, and ends either when
the interrupt is disabled or the guest is shut down. However, due to the
asynchronous nature of this, we MUST quiesce the softirq before we set
up the PIRQ. That is: pt_pirq_destroy_bind -> pt_irq_create_bind [must
have the softirq stopped].
And that is so that we can reset the 'hvm_pirq_dpci' structure properly
(in case we actually can clean it up in the error paths of
'pt_irq_create_bind'). We can stick the 'quiesce' part at the end of
'pt_pirq_destroy_bind' or at the start of 'pt_irq_create_bind'.

And the only way I can think of doing this quiesce is:

     if ( pt_pirq_softirq_active(pirq_dpci) )
     {
         spin_unlock(&d->event_lock);
-        ASSERT_NOT_IN_ATOMIC();
-        process_pending_softirqs();
+        cpu_relax();
         goto restart;
     }

Which would replicate what 'tasklet_kill' does.

> > Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel