[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v8 for-xen-4.5 2/2] dpci: Replace tasklet with an softirq (v8)
>>> On 21.10.14 at 19:19, <konrad.wilk@xxxxxxxxxx> wrote: > +/* > + * These two bit states help to safely schedule, deschedule, and wait until > + * the softirq has finished. > + * > + * The semantics behind these two bits is as follow: > + * - STATE_SCHED - whoever clears it has to ref-count the domain (->dom). s/clears/modifies/ > + * - STATE_RUN - only the softirq is allowed to set and clear it. If it has > + * been set the hvm_dirq_assist will RUN with an saved value of the s/ an / a / (and perhaps also ditch the first "the" on that line, and the similarly further down) > + STATE_SCHED, /* Bit 0 */ Bogus comment (effectively re-stating what the language specification says)? > +/* > + * Should only be called from hvm_do_IRQ_dpci. We use the > + * 'state' as an gate to thwart multiple interrupts being scheduled. s/ an / a / > + * > + * The 'state' is cleared by 'softirq_dpci' when it has > + * completed executing 'hvm_dirq_assist' or by 'pt_pirq_softirq_reset' > + * if we want to try to unschedule the softirq before it runs. > + * Stray blank comment line. > +/* > + * If we are racing with softirq_dpci (state is still set) we return > + * -ERESTART. Otherwise we return 0. > + * > + * If it is -ERESTART, it is the callers responsibility to make sure > + * that the softirq (with the event_lock dropped) has ran. We need > + * to flush out the outstanding 'dpci_softirq' (no more of them > + * will be added for this pirq as the IRQ action handler has been > + * reset in pt_irq_destroy_bind). > + */ > +int pt_pirq_softirq_active(struct hvm_pirq_dpci *pirq_dpci) > +{ > + if ( pirq_dpci->state & (STATE_RUN | STATE_SCHED) ) > + return -ERESTART; > + > + /* > + * If in the future we would call 'raise_softirq_for' right away > + * after 'pt_pirq_softirq_active' we MUST reset the list (otherwise it > + * might have stale data). > + */ > + return 0; > +} Having this return -ERESTART and 0 rather than a simple boolean is kind of odd as long as there are no other errors possible here. > +static void pt_pirq_softirq_reset(struct hvm_pirq_dpci *pirq_dpci) > +{ > + struct domain *d = pirq_dpci->dom; > + > + ASSERT(spin_is_locked(&d->event_lock)); > + /* > + * The reason it is OK to reset 'dom' when STATE_RUN bit is set is due > + * to a shortcut the 'dpci_softirq' implements. It stashes the 'dom' in > + * a local variable before it sets STATE_RUN - and therefore will not > + * dereference '->dom' which would result in a crash. > + */ > + if ( test_bit(STATE_RUN, &pirq_dpci->state) ) > + { > + pirq_dpci->dom = NULL; > + return; > + } > + /* > + * We are going to try to de-schedule the softirq before it goes in > + * STATE_RUN. Whoever clears STATE_SCHED MUST refcount the 'dom'. > + */ > + if ( test_and_clear_bit(STATE_SCHED, &pirq_dpci->state) ) > + { > + put_domain(d); > + pirq_dpci->dom = NULL; > + } Would it not be easier to follow if instead of the two if()-s you used switch(cmpxchg(..., STATE_SCHED, 0)) here? > @@ -128,6 +234,21 @@ int pt_irq_create_bind( > } > pirq_dpci = pirq_dpci(info); > /* > + * A crude 'while' loop with us dropping the spinlock and giving > + * the softirq_dpci a chance to run. > + * We MUST check for this condition as the softirq could be scheduled > + * and hasn't run yet. We do this up to one second at which point we > + * give up. Note that this code replaced tasklet_kill which would have > + * spun forever and would do the same thing (wait to flush out > + * outstanding hvm_dirq_assist calls. > + */ Stale comment - there's no 1s timeout here anymore. > + if ( pt_pirq_softirq_active(pirq_dpci) ) > + { > + spin_unlock(&d->event_lock); > + process_pending_softirqs(); ASSERT_NOT_IN_ATOMIC() between these two (the assertion process_pending_softirqs() does seems too weak for the purposes here, as we really shouldn't be holding any other spin locks; otoh it's not really clear to me why that aspect is different from do_softirq() - just fired off a separate mail)? > @@ -400,8 +535,13 @@ int pt_irq_destroy_bind( > msixtbl_pt_unregister(d, pirq); > if ( pt_irq_need_timer(pirq_dpci->flags) ) > kill_timer(&pirq_dpci->timer); > - pirq_dpci->dom = NULL; > pirq_dpci->flags = 0; > + /* > + * Before the 'pirq_guest_unbind' had been called an interrupt could > + * have been scheduled. No more of them are going to be scheduled > after > + * that but we must deal with the one that were put in the queue. > + */ This is the second of third instance of this or a similar comment. Please have it just once in full, and have the other(s) simply refer to the full one. > @@ -652,3 +773,81 @@ void hvm_dpci_eoi(struct domain *d, unsigned int > guest_gsi, > unlock: > spin_unlock(&d->event_lock); > } > + > +static void dpci_softirq(void) > +{ > + struct hvm_pirq_dpci *pirq_dpci; > + unsigned int cpu = smp_processor_id(); > + LIST_HEAD(our_list); > + > + local_irq_disable(); > + list_splice_init(&per_cpu(dpci_list, cpu), &our_list); > + local_irq_enable(); > + > + while ( !list_empty(&our_list) ) > + { > + struct domain *d; Considering that you already have one non-function scope variable here, please use the other one only needed inside this loop (pirq_dpci) here too. > + > + pirq_dpci = list_entry(our_list.next, struct hvm_pirq_dpci, > softirq_list); > + list_del(&pirq_dpci->softirq_list); > + > + d = pirq_dpci->dom; > + smp_wmb(); /* 'd' MUST be saved before we set/clear the bits. */ So you state the right thing, but use the wrong barrier: To order a read and a write, smp_mb() is your only choice. > -void pci_release_devices(struct domain *d) > +int pci_release_devices(struct domain *d) > { > struct pci_dev *pdev; > u8 bus, devfn; > + int ret; > > spin_lock(&pcidevs_lock); > - pci_clean_dpci_irqs(d); > + ret = pci_clean_dpci_irqs(d); > + if ( ret == -EAGAIN ) -ERESTART? > --- a/xen/include/xen/hvm/irq.h > +++ b/xen/include/xen/hvm/irq.h > @@ -93,13 +93,13 @@ struct hvm_irq_dpci { > /* Machine IRQ to guest device/intx mapping. */ > struct hvm_pirq_dpci { > uint32_t flags; > - bool_t masked; > + unsigned long state; I think "unsigned int" would be sufficient here? Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |