Re: Recent upgrade of 4.13 -> 4.14 issue



Hi list,
    This is a reply to the thread of the same title (linked here:
https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxxxxxxxxx/msg84916.html
), which I could not reply to directly because I receive this list as a
digest.

    I'm not sure whether this is exactly the same issue, but I
experienced the same symptoms when upgrading to Xen 4.14. The issue does
not occur if I downgrade to 4.11 (the previous version provided by
Debian). The kernel is 5.9.11 and is unchanged between Xen versions.

    One thing I noticed is that if I disable the monitor/mwait
instructions on my CPU (Intel Xeon E5-2699 v4 ES), the stalls seem to
occur later after boot. With the instructions enabled, the system
usually stalls within a few minutes of booting; with them disabled, it
can run for tens of minutes before stalling.

    Additionally disabling the HPET, or forcing the kernel to use the
PIT, makes the system somewhat usable. The stalls still occur tens of
minutes in, but somehow everything seems to keep chugging along fine.
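
    For anyone comparing notes, here is a minimal Python sketch for
recording which clocksource the dom0 kernel is actually using after such
a change. It only reads the standard Linux sysfs files; the helper name
read_clocksources is purely illustrative and not part of any tool.

```python
from pathlib import Path

# Standard Linux sysfs location for clocksource information.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources, or (None, []) if unreadable."""
    base = Path(base)
    try:
        current = (base / "current_clocksource").read_text().strip()
        available = (base / "available_clocksource").read_text().split()
    except OSError:
        # Files are missing or unreadable (e.g. not running on Linux).
        return None, []
    return current, available


if __name__ == "__main__":
    current, available = read_clocksources()
    print("current:", current)
    print("available:", " ".join(available))
```

    Running it once per configuration (HPET disabled, PIT forced, and
so on) makes it easier to state in a report which clocksource was active
when a stall happened.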

    I've also verified that the stalls do not occur in any of the above
cases if I boot the same kernel directly, without Xen.

    When the stalls happen, I get the "rcu: INFO: rcu_sched detected
stalls on CPUs/tasks" backtraces printed on the console periodically,
but keystrokes don't do anything on the console, and I can't open new
SSH sessions even though the system still replies to pings. The last
item in the call trace is usually "xen_safe_halt", but I've also seen
other functions there, related to btrfs and the network adapter.
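
    As a rough aid for summarising such output, the sketch below counts
the stall reports in a saved console log and tallies the symbols that
appear in backtrace frames. It assumes the usual "symbol+0xoffset/0xsize"
frame format; the function name summarize and the log file argument are
hypothetical, and the tally covers every frame, not only the last one in
each trace.

```python
import re
import sys
from collections import Counter

# Substring of the stall message quoted above.
STALL_MARKER = "rcu_sched detected stalls"
# Kernel backtrace frames look like "symbol+0xoffset/0xsize".
FRAME_RE = re.compile(r"([A-Za-z_][\w.]*)\+0x[0-9a-fA-F]+/0x[0-9a-fA-F]+")


def summarize(log_text):
    """Return (number of stall reports, Counter of symbols seen in frames)."""
    stalls = 0
    symbols = Counter()
    for line in log_text.splitlines():
        if STALL_MARKER in line:
            stalls += 1
        for match in FRAME_RE.finditer(line):
            symbols[match.group(1)] += 1
    return stalls, symbols


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python script.py <saved console log>
    with open(sys.argv[1], errors="replace") as f:
        stalls, symbols = summarize(f.read())
    print(f"stall reports: {stalls}")
    for name, count in symbols.most_common(10):
        print(f"{count:6d}  {name}")
```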

    Do let me know if there's anything I can provide to help
troubleshoot this. At the moment I've reverted to 4.11, but I can
temporarily switch back to 4.14 to collect any information needed.

Liwei



 

