
Re: [Xen-devel] Question about Xen reboot on panic



Hi Jan,

2015-11-12 12:08 GMT-05:00 Jan Beulich <JBeulich@xxxxxxxx>:
>>>> On 12.11.15 at 17:57, <xumengpanda@xxxxxxxxx> wrote:
>>> After looking into the code, I found the following code in the
>>> machine_restart(), which is quite suspicious.
>>>
>>>     if ( system_state >= SYS_STATE_smp_boot )
>>>     {
>>>         local_irq_enable();
>>>
>>>         /* Ensure we are the boot CPU. */
>>>         if ( get_apic_id() != boot_cpu_physical_apicid )
>>
>> If we are at the boot CPU and the if statement return true
>>
>>>         {
>>>             /* Send IPI to the boot CPU (logical cpu 0). */
>>>             on_selected_cpus(cpumask_of(0), __machine_restart,
>>>                              &delay_millisecs, 0);
>>
>> we will send an IPI from CPU 0 to CPU to run machine_restart.
>
> The other way around you mean.
>
>>>             for ( ; ; )
>>>                 halt();
>>
>> and CPU 0 will halt immediately.
>>
>> If the IPI arrives later on CPU 0, CPU 0 won't be able to handle it,
>> since it has been halted.
>
> It's CPUn that gets halted, not CPU0. This ...

You are right.  When system_state >= SYS_STATE_smp_boot, CPU i (i != 0)
will send an IPI to CPU 0.


>
>> (XEN) On P0
>> As this line suggests, P0 sends P0 an IPI and P0 goes to halt immediately...
>
> ... is suspicious: Is boot_cpu_physical_apicid not set correctly?
> Or is get_apic_id() returning rubbish?

After printing boot_cpu_physical_apicid and get_apic_id(), I
found that both are correct!

However, the line after that if block is:
smp_send_stop();

which is outside the if ( get_apic_id() != boot_cpu_physical_apicid ) block.

So P0 may run this call too, and from my reading of smp_send_stop(),
it contains the following code:

    local_irq_disable();
    __stop_this_cpu();
    disable_IO_APIC();
    hpet_disable();
    local_irq_enable();

My guess is that when __stop_this_cpu() runs on P0, P0 itself is
stopped. That would explain why P0 never gets to the rest of the
logic in machine_restart(), and so the machine never restarts.

If I move the smp_send_stop() call into the if block, Xen reboots.
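
To make the hypothesis concrete, here is a toy user-space model of the
two layouts. It is not Xen code: every toy_* name is made up for
illustration, and the "stopped" flag simply encodes my guess that a CPU
makes no further progress once __stop_this_cpu() has run on it. If that
guess is wrong, the model is wrong too.

```c
#include <stdbool.h>

/* Toy model only, NOT Xen code. All toy_* names are hypothetical. */
struct toy_cpu {
    bool is_boot_cpu;   /* models get_apic_id() == boot_cpu_physical_apicid */
    bool stopped;       /* ASSUMPTION: set once __stop_this_cpu() has run */
    bool reset_issued;  /* set when the final reset step is reached */
};

/* Stand-in for smp_send_stop(): under my guess, the caller is stopped. */
void toy_smp_send_stop(struct toy_cpu *c)
{
    c->stopped = true;
}

/* Stand-in for the tail of machine_restart(): only a running CPU resets. */
void toy_reset(struct toy_cpu *c)
{
    if (!c->stopped)
        c->reset_issued = true;
}

/* Current layout: smp_send_stop() sits after the if block,
 * so the boot CPU executes it as well. */
bool toy_restart_current(bool is_boot_cpu)
{
    struct toy_cpu c = { is_boot_cpu, false, false };

    if (!c.is_boot_cpu)
        return false;       /* non-boot CPU: IPI to CPU 0, then halt() */

    toy_smp_send_stop(&c);
    toy_reset(&c);
    return c.reset_issued;
}

/* Layout I tried: smp_send_stop() moved inside the if block,
 * so the boot CPU skips it. */
bool toy_restart_proposed(bool is_boot_cpu)
{
    struct toy_cpu c = { is_boot_cpu, false, false };

    if (!c.is_boot_cpu) {
        toy_smp_send_stop(&c);
        return false;       /* non-boot CPU still never resets by itself */
    }

    toy_reset(&c);
    return c.reset_issued;
}
```

In this model toy_restart_current(true) never reaches the reset step,
while toy_restart_proposed(true) does, which matches what I observe on
my machine. It says nothing about whether the other CPUs are quiesced
properly in the proposed layout.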

Do you think this could be a fix?
If I misunderstood anything, please let me know...

Thanks,

Meng


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

