
Re: Linux DomU freezes and dies under heavy memory shuffling



On 17.02.21 09:12, Roman Shaposhnik wrote:
Hi Jürgen, thanks for taking a look at this. A few comments below:

On Tue, Feb 16, 2021 at 10:47 PM Jürgen Groß <jgross@xxxxxxxx> wrote:

On 16.02.21 21:34, Stefano Stabellini wrote:
+ x86 maintainers

It looks like the tlbflush is getting stuck?

I have seen this case multiple times on customer systems now, but
reproducing it reliably seems to be very hard.

It is reliably reproducible under my workload, but it takes a long time
(~3 days of the workload running in the lab).

This is by far the best reproduction rate I have seen up to now.

The next best reproducer seems to be a huge installation with several
hundred hosts and thousands of VMs with about 1 crash each week.


I suspected fifo events were to blame, but just yesterday I've been
informed of another case with fifo events disabled in the guest.

One common pattern seems to be that up to now I have seen this effect
only on systems with Intel Gold cpus. Can it be confirmed to be true
in this case, too?

I am pretty sure mine isn't -- I can get you full CPU specs if that's useful.

Just the output of "grep model /proc/cpuinfo" should be enough.
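
For a machine with many CPUs, the raw grep output repeats the same lines once
per CPU. A minimal sketch for condensing it is below; the function name
summarize_cpu_models and its optional file argument are made up here for
illustration and are not part of any existing tool.

```shell
# Hypothetical helper: count the distinct "model" lines in /proc/cpuinfo.
# The optional first argument (an assumption, for testing) names another file.
summarize_cpu_models() {
    grep model "${1:-/proc/cpuinfo}" | sort | uniq -c
}
```

Running summarize_cpu_models with no argument in the guest and in dom0 should
print one line per distinct model entry, prefixed by the number of CPUs
reporting it.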


In case anybody has a reproducer (either in a guest or dom0) with a
setup where a diagnostic kernel can be used, I'd be _very_ interested!

I can easily add things to Dom0 and DomU. Whether that will disrupt the
experiment is, of course, another matter. Still, please let me know what
would be helpful to do.

Is there a chance to switch to an upstream kernel in the guest? I'd like
to add some diagnostic code to the kernel and creating the patches will
be easier this way.


Juergen



 

