
Re: [Xen-devel] Question about Xen reboot on panic



Hi Andrew,

2015-11-12 7:52 GMT-05:00 Andrew Cooper <andrew.cooper3@xxxxxxxxxx>:
> On 12/11/15 02:10, Meng Xu wrote:
>> Hi Andrew,
>>
>> 2015-11-11 18:34 GMT-05:00 Andrew Cooper <andrew.cooper3@xxxxxxxxxx>:
>>> On 11/11/2015 23:21, Meng Xu wrote:
>>>>> Finally, I can't tell from your paste below, but ensure that you are
>>>>> always using a debug hypervisor.
>>>> The source file Config.mk under the xen folder has
>>>> debug ?= y
>>>>
>>>> In addition,  "xl dmesg |grep debug" gives me:
>>>>
>>>> (XEN) Xen version 4.6-unstable (root@) (gcc (Ubuntu/Linaro
>>>> 4.6.3-1ubuntu5) 4.6.3) debug=y Wed Nov 11 17:06:30 EST 2015
>>>>
>>>> So I guess I'm using the debug hypervisor.
>>> You are
>>>
>>>> I reboot the system after removing all of those useless options (that
>>>> is, no more "reboot=k panic=2 panic_on_oops=1" in the Xen boot command
>>>> line.)
>>>>
>>>> Is there anything else I can do to force Xen to always reboot on panic or
>>>> oops?
>>> Unless you specify noreboot, Xen will try its hardest to reboot the
>>> system.  It is possible that you have a dodgy firmware which interacts
>>> poorly with the default methods.
>>>
>>> Does normal reboot from dom0 work as intended?
>> Yes. Before Xen crashes, I can reboot the machine from dom0 or from a serial port.
>>
>>> If not, debug in the following order:
>>>
>>> * `reboot` from the dom0 shell
>>> * `echo b > /proc/sysrq-trigger` from the dom0 shell
>>> * `xl debug-keys R` from the dom0 shell
>> All of these three approaches can reboot the machine successfully.
>>
>>
>>> * CTRL-A x3, R from the serial console
>> I think "Ctrl-A x3" means that I should press "Ctrl + A" three times. Am I
>> correct?
>> When I press Ctrl-A twice, it shows "No other window"; after I press
>> "Ctrl-A" three consecutive times and then press R (or r), it shows
>> "+wrap" on the serial console.
>
> In which case the program you are using locally to connect to the serial
> console (Minicom / screen/ putty?) is intercepting CTRL-a for its own
> purposes.

I'm using screen.

>
> In screen for example, you need to send CTRL-a a to send a "CTRL-a" on
> the serial.
>

I see. After using Ctrl-a a to send Ctrl-a to the serial line, I can
reboot the machine while Xen is running normally.
However, once Xen has crashed, I cannot switch to Xen's (debug)
console and reboot from there.
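
As an aside, a minimal sketch of one way to avoid the extra keystroke:
screen's -e option moves its command character away from Ctrl-a, so a
plain Ctrl-a goes straight to the serial line. The function name, the
device path /dev/ttyS0 and the baud rate 115200 are assumptions for
illustration and need adjusting to the actual setup.

```shell
# Hypothetical helper: open a serial console with screen's command key
# remapped from Ctrl-a to Ctrl-b ("-e ^Bb"), so Ctrl-a is passed through.
# $1 = serial device (assumed default /dev/ttyS0)
# $2 = baud rate     (assumed default 115200)
serial_console() {
    screen -e '^Bb' "${1:-/dev/ttyS0}" "${2:-115200}"
}
```

With this, pressing Ctrl-a three times in the session should reach Xen
unmodified, and screen's own commands move to Ctrl-b.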

>>
>> From the serial console, I can press "Ctrl+o, b" to reboot the machine
>> when Xen hasn't crashed. But once Xen has crashed, the serial port no
>> longer works.
>
> Ctrl-o b is exactly the same as `echo b > /proc/sysrq-trigger`.
>
>>
>> BTW, the serial port is a PCI serial port instead of the legacy
>> serial port on the motherboard. Is the PCI serial port a problem? On
>> another machine with the legacy serial port, I can use "Ctrl - o, b"
>> to reboot even when the system crashes. :-(
>
> Once dom0 stops responding to its console, CTRL-o won't help you at all.

>
>>
>>> Those are the reboot options.  It is also possible that a kexec kernel
>>> is being loaded and that is getting stuck.
>>>
>>> The crash options are:
>>> * `kexec -p`
>>> * `echo c > /proc/sysrq-trigger`
>>> * `xl debug-keys C`
>> The "xl debug-keys c" will not reboot the system, but it will print
>> out the crash message in the serial console.
>
> Right - in which case there is a problem on the crash path, rather than
> the reboot path.

Yes. I had a look at xl debug-keys before. ;-)

>
> Are you (or rather, your dom0) loading a crash kernel?

Sorry, I don't quite understand this question.
Do you mean whether I load a crash kernel for Xen when I boot the machine?
What I did is:
I booted a buggy Xen build (one in which I know how to trigger a crash
in the scheduler) and made Xen crash by creating and destroying
a VM. Xen crashes when the VM is destroyed, and I then try to reboot the
machine.

dom0's kernel is unmodified.
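
For reference, a minimal sketch of how the Xen command line could be
checked for the two options relevant here: "noreboot" (mentioned
earlier in the thread) and "crashkernel=" (which reserves memory for a
crash kernel). The function name and messages are made up for
illustration, and the commented usage line assumes that `xl info`
prints a "xen_commandline : ..." line.

```shell
# Hypothetical helper: report reboot/crash-related options found in a
# Xen command line string passed as the first argument.
check_xen_cmdline() {
    # Intentionally unquoted so the string is split into separate options.
    for opt in $1; do
        case "$opt" in
            noreboot)      echo "noreboot is set: automatic reboot is disabled" ;;
            crashkernel=*) echo "crash kernel memory reserved: $opt" ;;
        esac
    done
}

# On a live dom0 (assumption: xl info has a xen_commandline field):
#   check_xen_cmdline "$(xl info | sed -n 's/^xen_commandline *: *//p')"
```

If neither line is printed, the command line does not disable the
automatic reboot and reserves no memory for a crash kernel.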

>
>>
>>> If those don't work then you will need to start instrumenting Xen to
>>> work out where stuff is going wrong.
>> It seems all of the above commands work on my machine (except for the
>> Ctrl-A x3, R). Is there anything else I can do to force the system
>> reboot at panic?
>
> Get CTRL-A working first.  That is simply a configuration interaction
> with the software you are using to connect to the serial console.

Yes, it works while Xen is running normally.

>
> Once you get that working, you will be able to use debug keys from the
> serial console itself, rather than via `xl debug-keys`.
>
> After that, you should start putting printk()s in machine_restart() to
> see where execution is actually getting to.

So that I can debug and fix the issue (if there is an issue in that
code path), am I right?

Thank you very much for your help! :-)

Best regards,

Meng



-- 


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

