
Re: [Xen-devel] [PATCH] xen/events: xen_evtchn_fifo_init can be called very late



On 28/01/14 00:34, Julien Grall wrote:
> On ARM, xen_init_IRQ (which calls xen_evtchn_fifo_init) is called after
> all CPUs are online. This means that the notifier will never be called.

Why does ARM call xen_init_IRQ() so late?  Is it possible to call it
earlier when only the boot CPU is online?  There are problems with
attempting to init FIFO event channels after all CPUs are online.

If evtchn_fifo_init_control_block(cpu) fails on anything other than the
first CPU, that CPU will be unable to receive any events.  Xen will have
been switched to FIFO mode and it is not possible to revert to
2-level mode.

> Therefore, when a secondary CPU receives an interrupt, Linux will segfault
> because the event channel structure for this processor is not initialized.
> 
> This can be fixed by calling the init function on every online cpu when the
> event channel fifo driver is initialized.
> 
> Signed-off-by: Julien Grall <julien.grall@xxxxxxxxxx>
> ---
>  drivers/xen/events/events_fifo.c |   11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c
> index 1de2a19..15498ab 100644
> --- a/drivers/xen/events/events_fifo.c
> +++ b/drivers/xen/events/events_fifo.c
> @@ -410,12 +410,14 @@ static struct notifier_block evtchn_fifo_cpu_notifier = {
>  
>  int __init xen_evtchn_fifo_init(void)
>  {
> -     int cpu = get_cpu();
> +     int cpu;
>       int ret;
>  
> -     ret = evtchn_fifo_init_control_block(cpu);
> -     if (ret < 0)
> -             goto out;
> +     for_each_online_cpu(cpu) {
> +             ret = evtchn_fifo_init_control_block(cpu);
> +             if (ret < 0)
> +                     goto out;

You need to handle this error differently depending on whether the first
call fails or not.

Failure on first CPU: return an error and the caller will fall back to
using 2-level mode.

Failure on second or later CPU: you need to offline that CPU.  It may
not be possible to offline a CPU with standard calls (e.g., cpu_down())
as it won't have working interrupts.
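
The two failure cases above can be sketched roughly as follows. This is
a userspace model only, not kernel code and not a proposed patch:
init_control_block_stub() stands in for
evtchn_fifo_init_control_block(), and park_cpu_stub(), fail_on_cpu,
cpu_parked and NR_CPUS are hypothetical names invented for the
illustration. How a CPU without working interrupts would actually be
taken out of service is left open here, as it is above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for the model */

/* Hypothetical knob: CPU whose control block init fails (-1 = none). */
static int fail_on_cpu = -1;
/* Hypothetical record of CPUs the sketch took out of service. */
static bool cpu_parked[NR_CPUS];

/* Stand-in for evtchn_fifo_init_control_block(); fails on one chosen CPU. */
static int init_control_block_stub(int cpu)
{
	return cpu == fail_on_cpu ? -ENOMEM : 0;
}

/* Placeholder for "offline this CPU"; the real mechanism is an open question. */
static void park_cpu_stub(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_parked[cpu] = true;
}

static int fifo_init_sketch(void)
{
	bool fifo_active = false;	/* true once any control block is set up */
	int cpu, ret;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		ret = init_control_block_stub(cpu);
		if (ret == 0) {
			fifo_active = true;
			continue;
		}
		/* First call failed: report it so the caller can stay in 2-level mode. */
		if (!fifo_active)
			return ret;
		/* FIFO mode is already in use: this CPU cannot receive events. */
		park_cpu_stub(cpu);
	}
	return 0;
}
```

With fail_on_cpu set to 0 the sketch returns -ENOMEM and parks nothing;
with it set to a later CPU the sketch returns 0 and only that CPU is
marked as parked.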

David

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

