[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] Xen/timer: Disable watchdog during dumping timer queues



>>> On 13.09.16 at 16:56, <tianyu.lan@xxxxxxxxx> wrote:

Seems like mistakenly dropped xen-devel in my earlier reply; re-added.

> On 9/13/2016 5:02 PM, Jan Beulich wrote:
>>>>> On 13.09.16 at 09:12, <tianyu.lan@xxxxxxxxx> wrote:
>>> On a machine with a mount of cpus, dump_timerq() lasts several seconds
>>> which may exceed watchdog timeout and cause Xen hyperviosr reboot.
>>> This patch is to disable watchdog when dump timer queues to fix the
>>> issue.
>>
>> But disabling the watchdog is not the resolution. Instead you
>> want to invoke process_pending_softirqs() every once in a while,
>> just like certain other key handlers do.
> 
> Actually, I have tried that way but it's not enough.
> Keyhandler may run in the timer handler and the following log shows
> calltrace. The timer subsystem run all expired timers' handler
> before programing next timer event. If keyhandler runs longer than
> timeout, there will be no chance to configure timer before triggering
> watchdog and hypervisor rebooting.
> 
> (XEN)    [<ffff82d0801145be>] handle_keypress+0x7b/0xa5
> (XEN)    [<ffff82d080142dff>] console.c#__serial_rx+0x16/0x54
> (XEN)    [<ffff82d080142ed1>] console.c#serial_rx+0x94/0x9f
> (XEN)    [<ffff82d080145641>] serial_rx_interrupt+0xa9/0xbf
> (XEN)    [<ffff82d080143b5f>] ns16550.c#__ns16550_poll+0x53/0xb4
> (XEN)    [<ffff82d080197c1c>] do_invalid_op+0x26d/0x49f
> (XEN)    [<ffff82d08023cfb0>] cpufreq.c#handle_exception_saved+0x2e/0x6c
> (XEN)    [<ffff82d080143594>] ns16550.c#ns16550_poll+0x20/0x24
> (XEN)    [<ffff82d080130769>] timer.c#execute_timer+0x4e/0x6c
> (XEN)    [<ffff82d0801308b7>] timer.c#timer_softirq_action+0xea/0x222
> (XEN)    [<ffff82d08012cc8a>] softirq.c#__do_softirq+0x9b/0xac
> (XEN)    [<ffff82d08012ccae>] do_softirq+0x13/0x15
> (XEN)    [<ffff82d08016400a>] domain.c#idle_loop+0x78/0x88

Wait - what is do_invalid_op() doing on the stack? I don't think it
belongs there, and hence I wonder whether the keypress
happened after some already fatal event (in which case all bets
are off anyway).

> Another solution is to schedule a tasklet to run keyhandler in timer
> handler and invoke process_pending_softirqs() in the dump_timerq().
> This also works but it requires to rework keyhandler mechanism.
> 
> Disable watchdog seems to be simpler and I found dump_registers() also
> used the same way to deal with the issue.

That's true. Just that on large machines it defaults to the
alternative model, for which I'm not sure it actually needs the
watchdog disabled (as data for a single CPU shouldn't exceed
the threshold).

Jan

> Here is my draft patch of reworking keyhandler.
> diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
> index 16de6e8..579d076 100644
> --- a/xen/common/keyhandler.c
> +++ b/xen/common/keyhandler.c
> @@ -75,9 +75,10 @@ static struct keyhandler {
> 
>   static void keypress_action(unsigned long unused)
>   {
> -    handle_keypress(keypress_key, NULL);
> +    console_start_log_everything();
> +    h->fn(keypress_key);
> +    console_end_log_everything();
>   }
> -
>   static DECLARE_TASKLET(keypress_tasklet, keypress_action, 0);
> 
>   void handle_keypress(unsigned char key, struct cpu_user_regs *regs)
> @@ -87,10 +88,10 @@ void handle_keypress(unsigned char key, struct 
> cpu_user_regs *regs)
>       if ( key >= ARRAY_SIZE(key_table) || !(h = &key_table[key])->fn )
>           return;
> 
> -    if ( !in_irq() || h->irq_callback )
> +    if (h->irq_callback))
>       {
>           console_start_log_everything();
> -        h->irq_callback ? h->irq_fn(key, regs) : h->fn(key);
> +        h->irq_fn(key, regs);
>           console_end_log_everything();
>       }
>       else
> 
> 
>>
>> Jan
>>




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.