[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [Xen-devel] x86 guest EOI timer issues / questions
All, Andrew's observation mentioned in "[PATCH v2] x86/smt: Support for enabling/disabling SMT at runtime" and some discussions we've already had with him made me look into how exactly the EOI timer works. I think there are a number of quirks. I'll enumerate all questions I've run into below. At the end of the mail is a patch carrying out some of the adjustments implied by a subset of these questions, plus the addition of some debugging code. It is perhaps worthwhile to note that the printk()-s irq_guest_eoi_timer_fn() both were observed to trigger prior to making the non-debugging code changes. With the patch in its current shape, I've seen trigger (every once in a while) only the printk() added to __do_IRQ_guest(). - Why does the timer not get stopped in desc_guest_eoi() when ->in_flight is zero? - Why does irq_guest_eoi_timer_fn() not re-arm the timer when ->in_flight is still non-zero after all the decrements? Can this case happen at all, considering that the only increment in __do_IRQ_guest() happens under the same desc lock? (If it can happen, I think it would be against the purpose of 37eb6d05fe, as this would mean the IRQ can then remain un-acked for an indefinite period of time, unless something else re-armed the timer.) - Why does irq_guest_eoi_timer_fn() issue ->end() / set_eoi_ready() even when no decrement of any ->in_flight was done at all? Neither for ACKTYPE_UNMASK nor for ACKTYPE_EOI this looks to be fully race free. - Wouldn't __do_IRQ_guest() better stop the timer early on? - Wouldn't __do_IRQ_guest() better avoid re-programming the timer if no ->in_flight increment was done (which I would have thought shouldn't happen anyway, but which I've observed in practice)? Of course this would mean the timer may only be stopped when the first increment occurs. - What about the timer triggering while __do_IRQ_guest() is active (holding the desc lock)? This looks to result in immediate expiry of the EOI deferral for the current interrupt instance (and I've observed this in practice, luckily with no other visible bad effects). - Is it okay for fixup_eoi() to fiddle with action->cpu_eoi_map without holding the desc lock? (I guess being in stop-machine / SMP-boot context makes this acceptable.) Commits of interest: f3922f4084 x86: Support CPU hotplug offline 37eb6d05fe x86: Automatically EOI guest-bound interrupts if guest takes too long Other notes / observations: - irq_guest_eoi_timer_fn() should not need to re-acquire the lock after on_selected_cpus() - irq_guest_eoi_timer_fn() should not need to use irqsave/irqrestore spin lock/unlock (plain irq ones should suffice in a timer handler) - irq_guest_eoi_timer_fn() could ASSERT(action->ack_type != ACKTYPE_NONE) instead of having such a conditional - fixup_eoi() may better avoid setting all peoi[].ready and instead pass a "force" boolean to flush_ready_eoi() Thoughts / opinions appreciated, Jan --- unstable.orig/xen/arch/x86/irq.c +++ unstable/xen/arch/x86/irq.c @@ -1103,7 +1103,7 @@ static void set_eoi_ready(void *data); static void irq_guest_eoi_timer_fn(void *data) { struct irq_desc *desc = data; - unsigned int irq = desc - irq_desc; + unsigned int irq = desc - irq_desc, done = 0; irq_guest_action_t *action; cpumask_t cpu_eoi_map; unsigned long flags; @@ -1115,6 +1115,16 @@ static void irq_guest_eoi_timer_fn(void action = (irq_guest_action_t *)desc->action; + if ( action->eoi_timer.status >= TIMER_STATUS_in_heap ) +{//temp + static unsigned long cnt, thr; + if(++cnt > thr) { + thr |= cnt; + printk("IRQ%u: i=%u a=%d n=%u\n", irq, action->in_flight, action->ack_type, action->nr_guests); + } + goto out; +} + if ( action->ack_type != ACKTYPE_NONE ) { unsigned int i; @@ -1122,11 +1132,29 @@ static void irq_guest_eoi_timer_fn(void { struct domain *d = action->guest[i]; unsigned int pirq = domain_irq_to_pirq(d, irq); + if ( test_and_clear_bool(pirq_info(d, pirq)->masked) ) - action->in_flight--; + ++done; } + action->in_flight -= done; } +{//temp + static unsigned long done_cnt, done_thr; + static unsigned long infl_cnt, infl_thr; + bool log = false; + if(action->in_flight && ++infl_cnt > infl_thr) { + infl_thr |= infl_cnt; + log = true; + } else if(!done && ++done_cnt > done_thr) { + done_thr |= done_cnt; + log = true; + } + if(log) + printk("IRQ%u: d=%u i=%u a=%d n=%u (%06lx,%06lx)\n", irq, done, action->in_flight, + action->ack_type, action->nr_guests, done_cnt, infl_cnt); +} +//todo Instead of the below, assert that ->in_flight is zero and bail if done is zero? if ( action->in_flight != 0 ) goto out; @@ -1156,6 +1184,7 @@ static void __do_IRQ_guest(int irq) int i, sp; struct pending_eoi *peoi = this_cpu(pending_eoi); unsigned int vector = (u8)get_irq_regs()->entry_vector; +unsigned done = 0;//temp if ( unlikely(action->nr_guests == 0) ) { @@ -1167,6 +1196,9 @@ static void __do_IRQ_guest(int irq) return; } + if ( action->ack_type != ACKTYPE_NONE ) + stop_timer(&action->eoi_timer); + if ( action->ack_type == ACKTYPE_EOI ) { sp = pending_eoi_sp(peoi); @@ -1187,6 +1219,7 @@ static void __do_IRQ_guest(int irq) pirq = pirq_info(d, domain_irq_to_pirq(d, irq)); if ( (action->ack_type != ACKTYPE_NONE) && !test_and_set_bool(pirq->masked) ) +++done,//temp action->in_flight++; if ( !is_hvm_domain(d) || !hvm_do_IRQ_dpci(d, pirq) ) send_guest_pirq(d, pirq); @@ -1194,10 +1227,16 @@ static void __do_IRQ_guest(int irq) if ( action->ack_type != ACKTYPE_NONE ) { - stop_timer(&action->eoi_timer); migrate_timer(&action->eoi_timer, smp_processor_id()); set_timer(&action->eoi_timer, NOW() + MILLISECS(1)); } +if(!done) {//temp + static unsigned long cnt, thr; + if(++cnt > thr) { + thr |= cnt; + printk("IRQ%d: d=0 n=%u i=%u a=%d\n", irq, action->nr_guests, action->in_flight, action->ack_type); + } +} } /* @@ -1457,6 +1496,8 @@ void desc_guest_eoi(struct irq_desc *des return; } + stop_timer(&action->eoi_timer); + if ( action->ack_type == ACKTYPE_UNMASK ) { ASSERT(cpumask_empty(action->cpu_eoi_map)); _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |