|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v8 for-xen-4.5 2/2] dpci: Replace tasklet with an softirq (v8)
>>> On 21.10.14 at 19:19, <konrad.wilk@xxxxxxxxxx> wrote:
> +/*
> + * These two bit states help to safely schedule, deschedule, and wait until
> + * the softirq has finished.
> + *
> + * The semantics behind these two bits is as follow:
> + * - STATE_SCHED - whoever clears it has to ref-count the domain (->dom).
s/clears/modifies/
> + * - STATE_RUN - only the softirq is allowed to set and clear it. If it has
> + * been set the hvm_dirq_assist will RUN with an saved value of the
s/ an / a / (and perhaps also ditch the first "the" on that line,
and the similarly further down)
> + STATE_SCHED, /* Bit 0 */
Bogus comment (effectively re-stating what the language
specification says)?
> +/*
> + * Should only be called from hvm_do_IRQ_dpci. We use the
> + * 'state' as an gate to thwart multiple interrupts being scheduled.
s/ an / a /
> + *
> + * The 'state' is cleared by 'softirq_dpci' when it has
> + * completed executing 'hvm_dirq_assist' or by 'pt_pirq_softirq_reset'
> + * if we want to try to unschedule the softirq before it runs.
> + *
Stray blank comment line.
> +/*
> + * If we are racing with softirq_dpci (state is still set) we return
> + * -ERESTART. Otherwise we return 0.
> + *
> + * If it is -ERESTART, it is the callers responsibility to make sure
> + * that the softirq (with the event_lock dropped) has ran. We need
> + * to flush out the outstanding 'dpci_softirq' (no more of them
> + * will be added for this pirq as the IRQ action handler has been
> + * reset in pt_irq_destroy_bind).
> + */
> +int pt_pirq_softirq_active(struct hvm_pirq_dpci *pirq_dpci)
> +{
> + if ( pirq_dpci->state & (STATE_RUN | STATE_SCHED) )
> + return -ERESTART;
> +
> + /*
> + * If in the future we would call 'raise_softirq_for' right away
> + * after 'pt_pirq_softirq_active' we MUST reset the list (otherwise it
> + * might have stale data).
> + */
> + return 0;
> +}
Having this return -ERESTART and 0 rather than a simple boolean
is kind of odd as long as there are no other errors possible here.
> +static void pt_pirq_softirq_reset(struct hvm_pirq_dpci *pirq_dpci)
> +{
> + struct domain *d = pirq_dpci->dom;
> +
> + ASSERT(spin_is_locked(&d->event_lock));
> + /*
> + * The reason it is OK to reset 'dom' when STATE_RUN bit is set is due
> + * to a shortcut the 'dpci_softirq' implements. It stashes the 'dom' in
> + * a local variable before it sets STATE_RUN - and therefore will not
> + * dereference '->dom' which would result in a crash.
> + */
> + if ( test_bit(STATE_RUN, &pirq_dpci->state) )
> + {
> + pirq_dpci->dom = NULL;
> + return;
> + }
> + /*
> + * We are going to try to de-schedule the softirq before it goes in
> + * STATE_RUN. Whoever clears STATE_SCHED MUST refcount the 'dom'.
> + */
> + if ( test_and_clear_bit(STATE_SCHED, &pirq_dpci->state) )
> + {
> + put_domain(d);
> + pirq_dpci->dom = NULL;
> + }
Would it not be easier to follow if instead of the two if()-s you
used switch(cmpxchg(..., STATE_SCHED, 0)) here?
> @@ -128,6 +234,21 @@ int pt_irq_create_bind(
> }
> pirq_dpci = pirq_dpci(info);
> /*
> + * A crude 'while' loop with us dropping the spinlock and giving
> + * the softirq_dpci a chance to run.
> + * We MUST check for this condition as the softirq could be scheduled
> + * and hasn't run yet. We do this up to one second at which point we
> + * give up. Note that this code replaced tasklet_kill which would have
> + * spun forever and would do the same thing (wait to flush out
> + * outstanding hvm_dirq_assist calls.
> + */
Stale comment - there's no 1s timeout here anymore.
> + if ( pt_pirq_softirq_active(pirq_dpci) )
> + {
> + spin_unlock(&d->event_lock);
> + process_pending_softirqs();
ASSERT_NOT_IN_ATOMIC() between these two (the assertion
process_pending_softirqs() does seems too weak for the purposes
here, as we really shouldn't be holding any other spin locks; otoh
it's not really clear to me why that aspect is different from
do_softirq() - just fired off a separate mail)?
> @@ -400,8 +535,13 @@ int pt_irq_destroy_bind(
> msixtbl_pt_unregister(d, pirq);
> if ( pt_irq_need_timer(pirq_dpci->flags) )
> kill_timer(&pirq_dpci->timer);
> - pirq_dpci->dom = NULL;
> pirq_dpci->flags = 0;
> + /*
> + * Before the 'pirq_guest_unbind' had been called an interrupt could
> + * have been scheduled. No more of them are going to be scheduled
> after
> + * that but we must deal with the one that were put in the queue.
> + */
This is the second of third instance of this or a similar comment.
Please have it just once in full, and have the other(s) simply refer
to the full one.
> @@ -652,3 +773,81 @@ void hvm_dpci_eoi(struct domain *d, unsigned int
> guest_gsi,
> unlock:
> spin_unlock(&d->event_lock);
> }
> +
> +static void dpci_softirq(void)
> +{
> + struct hvm_pirq_dpci *pirq_dpci;
> + unsigned int cpu = smp_processor_id();
> + LIST_HEAD(our_list);
> +
> + local_irq_disable();
> + list_splice_init(&per_cpu(dpci_list, cpu), &our_list);
> + local_irq_enable();
> +
> + while ( !list_empty(&our_list) )
> + {
> + struct domain *d;
Considering that you already have one non-function scope variable
here, please use the other one only needed inside this loop
(pirq_dpci) here too.
> +
> + pirq_dpci = list_entry(our_list.next, struct hvm_pirq_dpci,
> softirq_list);
> + list_del(&pirq_dpci->softirq_list);
> +
> + d = pirq_dpci->dom;
> + smp_wmb(); /* 'd' MUST be saved before we set/clear the bits. */
So you state the right thing, but use the wrong barrier: To order a
read and a write, smp_mb() is your only choice.
> -void pci_release_devices(struct domain *d)
> +int pci_release_devices(struct domain *d)
> {
> struct pci_dev *pdev;
> u8 bus, devfn;
> + int ret;
>
> spin_lock(&pcidevs_lock);
> - pci_clean_dpci_irqs(d);
> + ret = pci_clean_dpci_irqs(d);
> + if ( ret == -EAGAIN )
-ERESTART?
> --- a/xen/include/xen/hvm/irq.h
> +++ b/xen/include/xen/hvm/irq.h
> @@ -93,13 +93,13 @@ struct hvm_irq_dpci {
> /* Machine IRQ to guest device/intx mapping. */
> struct hvm_pirq_dpci {
> uint32_t flags;
> - bool_t masked;
> + unsigned long state;
I think "unsigned int" would be sufficient here?
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |