
Re: [Xen-devel] [Patch v5 2/5] x86/hpet: Use single apic vector rather than irq_descs for HPET interrupts



>>> On 14.11.13 at 17:01, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
> The new logic is as follows:
>  * A single high priority vector is allocated and used on all cpus.

Does this really need to be a high priority one? I'd think we'd be
fine with the lowest priority one we can get, as we only need the
wakeup here if nothing else gets a CPU to wake up.

> +/* Wake up all cpus in the channel mask.  Lock should be held. */
> +static void hpet_wake_cpus(struct hpet_event_channel *ch)
>  {
> -    unsigned int cpu = smp_processor_id();
> +    cpuidle_wakeup_mwait(ch->cpumask);
>  
> -    if ( cpumask_test_and_clear_cpu(cpu, mask) )
> -        raise_softirq(TIMER_SOFTIRQ);
> -
> -    cpuidle_wakeup_mwait(mask);
> -
> -    if ( !cpumask_empty(mask) )
> -       cpumask_raise_softirq(mask, TIMER_SOFTIRQ);
> +    if ( !cpumask_empty(ch->cpumask) )
> +       cpumask_raise_softirq(ch->cpumask, TIMER_SOFTIRQ);

Looks like the cpumask_empty() check isn't really needed here?

> +/* HPET interrupt entry.  This is set up as a high priority vector. */
> +static void do_hpet_irq(struct cpu_user_regs *regs)
>  {
> -    struct hpet_event_channel *ch = (struct hpet_event_channel *)data;
> -
> -    this_cpu(irq_count)--;
> +    unsigned int cpu = smp_processor_id();

This is being used just once, and hence rather pointless to have.

> -    desc->handler = &hpet_msi_type;
> -    ret = request_irq(ch->msi.irq, hpet_interrupt_handler, "HPET", ch);
> -    if ( ret >= 0 )
> -        ret = __hpet_setup_msi_irq(desc);
> +    ret = hpet_setup_msi(ch);

Why did you keep this? With that function now being called from
hpet_broadcast_enter() I don't see why you'd need this here (just
like you're removing it from hpet_broadcast_resume() without
replacement).

> @@ -574,6 +384,7 @@ void __init hpet_broadcast_init(void)
>          /* Stop HPET legacy interrupts */
>          cfg &= ~HPET_CFG_LEGACY;
>          n = num_hpets_used;
> +        free_channels = (1U << n) - 1;

This is undefined behaviour when n == 32 (the shift count would then
equal the width of the type). Since n > 0, I'd suggest

        free_channels = (u32)~0 >> (32 - n);

> +    ch = hpet_get_free_channel();
> +
> +    if ( ch )
> +    {
> +        spin_lock(&ch->lock);
> +
> +        /* This really should be an MSI channel by this point */
> +        ASSERT(!(ch->flags & HPET_EVT_LEGACY));
> +
> +        hpet_msi_mask(ch);
> +
> +        this_cpu(hpet_channel) = ch;
> +        ch->cpu = cpu;

I think even if benign, you should set ->cpu before storing into
hpet_channel.

> +        cpumask_set_cpu(cpu, ch->cpumask);
> +
> +        hpet_setup_msi(ch);
> +        hpet_program_time(ch, deadline, NOW(), 1);
> +        hpet_msi_unmask(ch);
> +
> +        spin_unlock(&ch->lock);
> +    }
> +    else
> +    {
> +        /* TODO - this seems very ugly */

And you nevertheless want it committed this way?

> +        s_time_t fallback_deadline = STIME_MAX;
> +        unsigned int i, fallback_idx = -1;
> +
> +        for ( i = 0; i < num_hpets_used; ++i )
> +        {
> +            ch = &hpet_events[i];
> +            spin_lock(&ch->lock);
> +
> +            if ( ch->cpu == -1 )
> +                goto continue_search;
> +
> +            /* This channel is going to expire far too early */
> +            if ( ch->next_event < deadline - MICROSECS(50) )
> +                goto continue_search;

So this is going to make us wake early. You remember that all this
code exists solely for power saving purposes? If I understand
correctly, the CPU will then basically spin entering an idle state
until its actual wakeup time is reached.

> +
> +            /* We can deal with being woken with 50us to spare */
> +            if ( ch->next_event <= deadline )
> +                break;
> +
> +            /* Otherwise record our best HPET to borrow. */
> +            if ( ch->next_event <= fallback_deadline )
> +            {
> +                fallback_idx = i;
> +                fallback_deadline = ch->next_event;
> +            }
> +
> +        continue_search:
> +            spin_unlock(&ch->lock);
> +            ch = NULL;
> +        }
> +
> +        if ( ch )
> +        {
> +            /* Found HPET with an appropriate time.  Request to be woken up */
> +            cpumask_set_cpu(cpu, ch->cpumask);
> +            this_cpu(hpet_channel) = ch;
> +            spin_unlock(&ch->lock);
> +        }
> +        else if ( fallback_deadline < STIME_MAX && fallback_deadline != -1 )
> +        {
> +            /*
> +             * Else we want to reprogram the fallback HPET sooner if possible,
> +             * but with all the spinlocking, it just might have passed.
> +             */
> +            ch = &hpet_events[fallback_idx];
> +
> +            spin_lock(&ch->lock);
>  
> -    if ( !(ch->flags & HPET_EVT_LEGACY) )
> -        hpet_attach_channel(cpu, ch);
> +            if ( ch->cpu != -1 && ch->next_event == fallback_deadline )

Can't this be "<= deadline", being quite a bit more flexible?

> +            {
> +                cpumask_set_cpu(cpu, ch->cpumask);
> +                hpet_program_time(ch, deadline, NOW(), 1);
> +            }
> +            else
> +                /* All else has failed.  Wander the idle loop again */
> +                this_cpu(timer_deadline) = NOW() - 1;
> +
> +            spin_unlock(&ch->lock);
> +        }
> +        else
> +            /* All HPETs were too soon.  Wander the idle loop again */
> +            this_cpu(timer_deadline) = NOW() - 1;

Here and above - what good will this do? Is this just in the
expectation that the next time round a free channel will likely be
found? If so, why can't you just go back to the start of the
function?

Further - how do you see this going back to the idle loop? The
way this is called from acpi/cpu_idle.c, you'll end up entering
C3 anyway, with nothing going to wake you up. By proceeding to
the end of the function, you even manage to disable the LAPIC
timer.

> +    /* If we own the channel, detach it */
> +    if ( ch->cpu == cpu )
> +    {
> +        hpet_msi_mask(ch);
> +        hpet_wake_cpus(ch);
> +        ch->cpu = -1;
> +        set_bit(ch->idx, &free_channels);

Wouldn't there be less cache line bouncing if the "free" flag was just
one of the channel flags?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

