
Re: Linux DomU freezes and dies under heavy memory shuffling



Hi Jürgen!

Sorry for the belated reply -- I wanted to externalize the VM before
writing back -- but let me at least answer your questions:

On Tue, Feb 23, 2021 at 5:17 AM Jürgen Groß <jgross@xxxxxxxx> wrote:
>
> On 18.02.21 06:21, Roman Shaposhnik wrote:
> > On Wed, Feb 17, 2021 at 12:29 AM Jürgen Groß <jgross@xxxxxxxx
> > <mailto:jgross@xxxxxxxx>> wrote:
> >
> >     On 17.02.21 09:12, Roman Shaposhnik wrote:
> >      > Hi Jürgen, thanks for taking a look at this. A few comments below:
> >      >
> >      > On Tue, Feb 16, 2021 at 10:47 PM Jürgen Groß <jgross@xxxxxxxx
> >     <mailto:jgross@xxxxxxxx>> wrote:
> >      >>
> >      >> On 16.02.21 21:34, Stefano Stabellini wrote:
> >      >>> + x86 maintainers
> >      >>>
> >      >>> It looks like the tlbflush is getting stuck?
> >      >>
> >      >> I have seen this case multiple times on customer systems now, but
> >      >> reproducing it reliably seems to be very hard.
> >      >
> >      > It is reliably reproducible under my workload, but it takes a long
> >      > time (~3 days of the workload running in the lab).
> >
> >     This is by far the best reproduction rate I have seen up to now.
> >
> >     The next best reproducer seems to be a huge installation with several
> >     hundred hosts and thousands of VMs with about 1 crash each week.
> >
> >      >
> >      >> I suspected fifo events to be blamed, but just yesterday I've been
> >      >> informed of another case with fifo events disabled in the guest.
> >      >>
> >      >> One common pattern seems to be that up to now I have seen this
> >     effect
> >      >> only on systems with Intel Gold cpus. Can it be confirmed to be true
> >      >> in this case, too?
> >      >
> >      > I am pretty sure mine isn't -- I can get you full CPU specs if
> >     that's useful.
> >
> >     Just the output of "grep model /proc/cpuinfo" should be enough.
> >
> >
> > processor: 3
> > vendor_id: GenuineIntel
> > cpu family: 6
> > model: 77
> > model name: Intel(R) Atom(TM) CPU  C2550  @ 2.40GHz
> > stepping: 8
> > microcode: 0x12d
> > cpu MHz: 1200.070
> > cache size: 1024 KB
> > physical id: 0
> > siblings: 4
> > core id: 3
> > cpu cores: 4
> > apicid: 6
> > initial apicid: 6
> > fpu: yes
> > fpu_exception: yes
> > cpuid level: 11
> > wp: yes
> > flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
> > pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp
> > lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
> > nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est
> > tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 movbe popcnt tsc_deadline_timer
> > aes rdrand lahf_lm 3dnowprefetch cpuid_fault epb pti ibrs ibpb stibp
> > tpr_shadow vnmi flexpriority ept vpid tsc_adjust smep erms dtherm ida
> > arat md_clear
> > vmx flags: vnmi preemption_timer invvpid ept_x_only flexpriority
> > tsc_offset vtpr mtf vapic ept vpid unrestricted_guest
> > bugs: cpu_meltdown spectre_v1 spectre_v2 mds msbds_only
> > bogomips: 4800.19
> > clflush size: 64
> > cache_alignment: 64
> > address sizes: 36 bits physical, 48 bits virtual
> > power management:
> >
> >      >
> >      >> In case anybody has a reproducer (either in a guest or dom0) with a
> >      >> setup where a diagnostic kernel can be used, I'd be _very_
> >     interested!
> >      >
> >      > I can easily add things to Dom0 and DomU. Whether that will
> >     disrupt the
> >      > experiment is, of course, another matter. Still please let me
> >     know what
> >      > would be helpful to do.
> >
> >     Is there a chance to switch to an upstream kernel in the guest? I'd like
> >     to add some diagnostic code to the kernel and creating the patches will
> >     be easier this way.
> >
> >
> > That's a bit tough -- the VM is based on stock Ubuntu, and if I upgrade
> > the kernel I'll have to fiddle with a lot of things to make the workload
> > functional again.
> >
> > However, I can install a debug kernel (from Ubuntu, etc.).
> >
> > Of course, if patching the kernel is the only way to make progress --
> > let's try that -- please let me know.
>
> I have found a nice upstream patch, which - with some modifications - I
> plan to give our customer as a workaround.
>
> The patch is for kernel 4.12, but chances are good it will apply to a
> 4.15 kernel, too.

I'm slightly confused about this patch -- it seems to me that it needs
to be applied to the guest kernel, correct?

If that's the case, the challenge I have is that I need to rebuild
the Canonical (Ubuntu) distro kernel with this patch. That seems
a bit daunting at first (I mean -- I'm pretty good at rebuilding
kernels, I just never do it with vendor ones ;-)).

So... if there's anyone here who has any suggestions on how to do that
-- I'd appreciate pointers.
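For my own notes, the rough sequence I'm planning to try is something
like the following. This is only a sketch, and the patch filename and
the dpkg-buildpackage route are my assumptions (the Ubuntu wiki also
documents a debian/rules-based flow):

```shell
# Sketch: rebuild the Ubuntu distro kernel with an extra patch applied.
# Assumes deb-src entries are enabled in the APT sources; the patch
# filename below is a placeholder.
set -euo pipefail

# Fetch build dependencies and the distro kernel source package.
sudo apt-get build-dep -y linux
apt-get source linux
cd linux-*/

# Verify the patch applies cleanly before committing to a long build.
patch --dry-run -p1 < ~/tlbflush-workaround.patch
patch -p1 < ~/tlbflush-workaround.patch

# Build unsigned binary packages; the result is distinguishable from
# the stock kernel by its changelog version.
dpkg-buildpackage -b -uc -us

# Install the resulting image and headers packages.
sudo dpkg -i ../linux-image-*.deb ../linux-headers-*.deb
```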

> I have been able to gather some more data.
>
> I have contacted the author of the upstream kernel patch I've been using
> for our customer (and that helped, by the way).
>
> It seems as if the problem occurs when running as a guest under at
> least Xen, KVM, and VMware, and there have been reports of bare-metal
> cases, too. This bug has been hunted for several years now; the
> patch author has been at it for eight months.
>
> So we can rule out a Xen problem.
>
> Finding the root cause is still important, of course, and your setup
> seems to have the best reproduction rate up to now.
>
> So any help would really be appreciated.
>
> Is the VM self contained? Would it be possible to start it e.g. on a
> test system on my side? If yes, would you be allowed to pass it on to
> me?

I'm working on externalizing the VM in a way that doesn't disclose anything
about the customer workload. I'm almost there -- sans my question about
the vendor kernel rebuild. I plan to make that VM available this week.
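In case it helps, the export itself should be straightforward --
something along these lines (the source volume path and output names
are placeholders for my setup):

```shell
# Sketch: compact and export a DomU disk image for sharing.
set -euo pipefail

# Convert the raw disk (or LVM volume) to a compressed qcow2 image;
# unallocated blocks are dropped, shrinking the file considerably.
qemu-img convert -O qcow2 -c /dev/vg0/domu-disk domu-export.qcow2

# Record a checksum so the recipient can verify the transfer.
sha256sum domu-export.qcow2 > domu-export.qcow2.sha256
```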

It goes without saying, but I would really appreciate your help in chasing this.

Thanks,
Roman.



 

