
Re: [Xen-devel] Host freezing after "fixing" recursive fault starting in multicalls.c



On 29.01.2020 09:29, Peter.Kurfer@xxxxxxxx wrote:
> As requested I configured one host with:
> 
>> loglvl=all guest_loglvl=all
> 
> and collected one day of logs via serial interface:
> 
> https://drive.google.com/drive/folders/1sQvyNH0Sz28tUeVRZl9mowhB0Htd8ZpO?usp=sharing
> 
> searching for "error" or "multicalls.c" leads to some stacktraces that might 
> be interesting.

Right, but the bad news is that there are no helpful hypervisor
messages at all. Sadly this is partly my fault, because I should
have asked you to do this log collection with a debug hypervisor.
Most of the possibly interesting messages would appear only there.

In any event, problems start quite a bit earlier, and typically
it's the first instance of a problem that is the most helpful to
analyze, as later ones may be cascade issues. The first sign of
problems is an overlapping

[14991.827762] BUG: unable to handle page fault for address: ffff888ae2eb6bd8

and

[14991.828172] WARNING: CPU: 5 PID: 2585 at arch/x86/xen/multicalls.c:102 xen_mc_flush+0x194/0x1c0

on CPUs 8 and 5.
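
To illustrate the point about looking at the first instance rather than
later ones, here is a minimal sketch of how the earliest BUG/WARNING
lines could be pulled out of a captured log. It assumes dmesg-style
"[seconds.micros]" timestamps as in the two lines quoted above; the
function name first_problems, the one-second window and the third sample
line are made up for illustration and are not taken from the collected
logs.

```python
import re

# Matches dmesg-style lines such as "[14991.827762] BUG: unable to handle ..."
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(BUG|WARNING):\s*(.*)$")


def first_problems(lines, window=1.0):
    """Return (timestamp, kind, message) for the earliest BUG/WARNING line
    and for any others logged within `window` seconds of it."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            hits.append((float(m.group(1)), m.group(2), m.group(3)))
    if not hits:
        return []
    # Serial captures can interleave output, so order by timestamp explicitly.
    hits.sort(key=lambda h: h[0])
    start = hits[0][0]
    return [h for h in hits if h[0] - start <= window]


# The first two lines are the ones quoted above; the third is hypothetical
# and stands in for a later follow-on report.
sample = [
    "[14991.827762] BUG: unable to handle page fault for address: ffff888ae2eb6bd8",
    "[14991.828172] WARNING: CPU: 5 PID: 2585 at arch/x86/xen/multicalls.c:102 xen_mc_flush+0x194/0x1c0",
    "[20000.000000] WARNING: hypothetical later follow-on message",
]

for ts, kind, msg in first_problems(sample):
    print(f"{ts:.6f} {kind}: {msg}")
```

With a real capture, the lines of the log file would be passed in place
of sample. Hypervisor messages with a different prefix are not matched
by this pattern and would still need to be read alongside.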

> As far as I know the ACPI errors in the context of IPMI can be ignored.

It looks that way, yes, at least for the purposes here. What I wouldn't
rule out as a possible cause of the problems is the significant number
of temperature-related messages. What I also find at least curious
(but possibly just because I know too little about the respective
aspects of modern kernels) are the recurring __text_poke() instances
in the stack traces. Assuming these are to be expected in the first
place, there might be a race here which is either Xen-specific or
simply has a much better chance of being hit (larger window?) when
running on Xen. But I'm afraid this will need looking into (or at
least commenting on) by a kernel person.

Jan
