Re: [Xen-devel] Live migrate with Linux >= 4.13 domU causes kernel time jumps and TCP connection stalls.
On 12/28/18 11:15 AM, Juergen Gross wrote:
> On 27/12/2018 22:12, Hans van Kranenburg wrote:
>> So,
>>
>> On 12/24/18 1:32 AM, Hans van Kranenburg wrote:
>>>
>>> On 12/21/18 6:54 PM, Hans van Kranenburg wrote:
>>>>
>>>> We've been tracking down a live migration bug during the last
>>>> three days here at work, and here's what we found so far.
>>>>
>>>> 1. Xen version and dom0 Linux kernel version don't matter.
>>>> 2. DomU kernel is >= Linux 4.13.
>>>>
>>>> When using live migrate to another dom0, this often happens:
>>>>
>>>> [   37.511305] Freezing user space processes ... (elapsed 0.001 seconds) done.
>>>> [   37.513316] OOM killer disabled.
>>>> [   37.513323] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
>>>> [   37.514837] suspending xenstore...
>>>> [   37.515142] xen:grant_table: Grant tables using version 1 layout
>>>> [18446744002.593711] OOM killer enabled.
>>>> [18446744002.593726] Restarting tasks ... done.
>>>> [18446744002.604527] Setting capacity to 6291456
>>>
>>> Tonight, I've been through 29 bisect steps to figure out a bit
>>> more. A make defconfig with Xen PV enabled for the domU already
>>> reproduces the problem, so a complete compile-and-test cycle only
>>> took about 7 minutes.
>>>
>>> So, it appears that the 18 gazillion seconds of uptime started
>>> happening earlier than the TCP situation. All of the test
>>> scenarios resulted in these huge uptime numbers in dmesg, but not
>>> all of them resulted in TCP connections hanging.
>>>
>>>> As a side effect, all open TCP connections stall, because the
>>>> timestamp counters of packets sent to the outside world are
>>>> affected:
>>>>
>>>> https://syrinx.knorrie.org/~knorrie/tmp/tcp-stall.png
>>>
>>> This is happening since:
>>>
>>> commit 9a568de4818dea9a05af141046bd3e589245ab83
>>> Author: Eric Dumazet <edumazet@xxxxxxxxxx>
>>> Date:   Tue May 16 14:00:14 2017 -0700
>>>
>>>     tcp: switch TCP TS option (RFC 7323) to 1ms clock
>>>
>>> [...]
>>>
>>>> [...]
>>>>
>>>> 3. Since this is related to time and clocks, the last thing we
>>>> tried today was, instead of using default settings, putting
>>>> "clocksource=tsc tsc=stable:socket" on the Xen command line and
>>>> "clocksource=tsc" on the domU Linux kernel command line. What we
>>>> observed after doing this is that the failure happens less often,
>>>> but it still happens. Everything else still applies.
>>>
>>> Actually, it seems that the important thing is that the uptimes of
>>> the dom0s are not very close to each other. After rebooting all
>>> four back without tsc options, and then rebooting one of them
>>> again a few hours later, I could easily reproduce the problem
>>> again when live migrating to the later-rebooted server.
>>>
>>>> Additional question:
>>>>
>>>> It's 2018, should we have these "clocksource=tsc
>>>> tsc=stable:socket" on Xen and "clocksource=tsc" anyway now, for
>>>> Xen 4.11 and Linux 4.19 domUs? All our hardware has
>>>> 'TscInvariant = true'.
>>>>
>>>> Related: https://news.ycombinator.com/item?id=13813079
>>>
>>> This is still interesting.
>>>
>>> ---- >8 ----
>>>
>>> Now, the next question is... is 9a568de481 bad, or shouldn't there
>>> be 18 gazillion whatever uptime already... In Linux 4.9 this
>>> doesn't happen, so the next task will be to find out where that
>>> started.
>>
>> And that's...
>>
>> commit f94c8d116997597fc00f0812b0ab9256e7b0c58f
>> Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Date:   Wed Mar 1 15:53:38 2017 +0100
>>
>>     sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface
>>
>> a.k.a. v4.11-rc2~30^2
>>
>> Before this commit, the time listed in dmesg seems to follow the
>> uptime of the domU, and after it, the time in dmesg seems to jump
>> around up and down when live migrating to different dom0s, with the
>> occasional/frequent jump to a number above 18000000000, which then
>> also shows the TCP timestamp breakage since 9a568de4.
>>
>> So, the next question is... what now? Any ideas appreciated.
>>
>> Can anyone else reproduce this? I have super-common HP DL360
>> hardware and mostly default settings, so it shouldn't be that hard.
>>
>> Should I mail some other mailing list with a question? Which one?
>> Do any of you Xen developers have more experience with the
>> timekeeping code?
>
> My gut feeling tells me that the above patch neglected Xen by
> setting a non-native TSC clock to "stable" too often (the "only call
> clear_sched_clock_stable() when we mark TSC unstable when we use
> native_sched_clock()" part of the commit message).
>
> I can have a more thorough look after Jan. 7th.
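For reference, this is how I read the relevant part of f94c8d116997,
going only by the commit message quoted above. The snippet below is my
own stubbed-down sketch (it compiles and runs standalone), not the
literal kernel code:

#include <stdbool.h>
#include <stdio.h>

static bool sched_clock_stable = true;  /* assumed set early on modern x86 */
static int tsc_unstable;

/* A Xen PV guest runs with a paravirt sched_clock, not the native
 * TSC-based one, so (if I read the commit right) this is false there. */
static bool using_native_sched_clock(void)
{
        return false;  /* pretend we are a Xen PV guest */
}

static void clear_sched_clock_stable(void)
{
        sched_clock_stable = false;
}

static void mark_tsc_unstable(const char *reason)
{
        if (tsc_unstable)
                return;
        tsc_unstable = 1;
        /* The change: before f94c8d116997 this call was unconditional;
         * now it is skipped for non-native sched_clock implementations. */
        if (using_native_sched_clock())
                clear_sched_clock_stable();
        printf("TSC marked unstable (%s); sched_clock still stable: %s\n",
               reason, sched_clock_stable ? "yes" : "no");
}

int main(void)
{
        mark_tsc_unstable("some reason");
        return 0;  /* prints: ... sched_clock still stable: yes */
}

If that reading is correct, a Xen PV domU keeps a "stable" sched_clock
no matter what, which would fit the jumping dmesg timestamps.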
Thanks in advance!

Some additional info: I've just left a domU running after the initial
live migrate:

[  171.727462] Freezing user space processes ... (elapsed 0.002 seconds) done.
[  171.729825] OOM killer disabled.
[  171.729832] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[  171.731439] suspending xenstore...
[  171.731672] xen:grant_table: Grant tables using version 1 layout
[18446742891.874140] OOM killer enabled.
[18446742891.874152] Restarting tasks ... done.
[18446742891.914103] Setting capacity to 6291456
[18446742934.549790] 14:13:50 up 3 min, 2 users, load average: 0.07, 0.02, 0.00
[18446742935.561404] 14:13:51 up 3 min, 2 users, load average: 0.07, 0.02, 0.00
[18446742936.572761] 14:13:52 up 3 min, 2 users, load average: 0.06, 0.02, 0.00
[18446742937.583537] 14:13:53 up 3 min, 2 users, load average: 0.06, 0.02, 0.00

I'm simply doing this:

  while true; do echo $(uptime) > /dev/kmsg; sleep 10; done

Now, after a while, this happens:

[18446744050.202985] 14:32:26 up 22 min, 2 users, load average: 0.00, 0.00, 0.00
[18446744060.214576] 14:32:36 up 22 min, 2 users, load average: 0.00, 0.00, 0.00
[18446744070.225909] 14:32:46 up 22 min, 2 users, load average: 0.00, 0.00, 0.00
[    6.527718] 14:32:56 up 22 min, 2 users, load average: 0.00, 0.00, 0.00
[   16.539315] 14:33:06 up 22 min, 2 users, load average: 0.00, 0.00, 0.00
[   26.550511] 14:33:16 up 23 min, 2 users, load average: 0.00, 0.00, 0.00

The 23-minute difference is exactly the difference in uptime between
the two dom0s involved in the live migration:

  source dom0:      up 4 days, 19:23
  destination dom0: up 4 days, 19:00

So that explains the 18446742891.874140 number, which corresponds to
something close to 'minus 23 minutes'.
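To double-check that (my arithmetic, assuming the printk timestamp is
an unsigned 64-bit nanosecond counter that went negative):

  2^64 / 10^9                                 = 18446744073.709551616  (u64 ns wrap point, in s)
  18446742891.874140 - 18446744073.709551616  = -1181.835              (clock right after migration, in s)
  -1181.835 - 171.731                         = -1353.567 s ~= -22.6 min (size of the jump)

So the clock jumped from +171.7 s straight to -1181.8 s, i.e. by
roughly minus 23 minutes, matching the dom0 uptime difference above.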
Happy holidays,
Hans

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel