
Re: [Xen-devel] Live migrate with Linux >= 4.13 domU causes kernel time jumps and TCP connection stalls.


  • To: Juergen Gross <jgross@xxxxxxxx>, Hans van Kranenburg <Hans.van.Kranenburg@xxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Hans van Kranenburg <hans@xxxxxxxxxxx>
  • Date: Fri, 28 Dec 2018 15:41:47 +0100
  • Cc: Igor Yurchenko <Igor.Yurchenko@xxxxxxxxxx>
  • Delivery-date: Fri, 28 Dec 2018 14:41:59 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 12/28/18 11:15 AM, Juergen Gross wrote:
> On 27/12/2018 22:12, Hans van Kranenburg wrote:
>> So,
>>
>> On 12/24/18 1:32 AM, Hans van Kranenburg wrote:
>>>
>>> On 12/21/18 6:54 PM, Hans van Kranenburg wrote:
>>>>
>>>> We've been tracking down a live migration bug during the last three days
>>>> here at work, and here's what we found so far.
>>>>
>>>> 1. Xen version and dom0 linux kernel version don't matter.
>>>> 2. DomU kernel is >= Linux 4.13.
>>>>
>>>> When using live migrate to another dom0, this often happens:
>>>>
>>>> [   37.511305] Freezing user space processes ... (elapsed 0.001 seconds)
>>>> done.
>>>> [   37.513316] OOM killer disabled.
>>>> [   37.513323] Freezing remaining freezable tasks ... (elapsed 0.001
>>>> seconds) done.
>>>> [   37.514837] suspending xenstore...
>>>> [   37.515142] xen:grant_table: Grant tables using version 1 layout
>>>> [18446744002.593711] OOM killer enabled.
>>>> [18446744002.593726] Restarting tasks ... done.
>>>> [18446744002.604527] Setting capacity to 6291456
>>>
>>> Tonight, I've been through 29 bisect steps to figure out a bit more. A
>>> make defconfig with enabling Xen PV for domU reproduces the problem
>>> already, so a complete cycle with compiling and testing had only to take
>>> about 7 minutes.
>>>
>>> So, it appears that this 18 gazillion seconds of uptime is a thing that
>>> started happening earlier than the TCP situation already. All of the
>>> test scenarios resulted in these huge uptime numbers in dmesg. Not all
>>> of them result in TCP connections hanging.
>>>
>>>> As a side effect, all open TCP connections stall, because the timestamp
>>>> counters of packets sent to the outside world are affected:
>>>>
>>>> https://syrinx.knorrie.org/~knorrie/tmp/tcp-stall.png
>>>
>>> This is happening since:
>>>
>>> commit 9a568de4818dea9a05af141046bd3e589245ab83
>>> Author: Eric Dumazet <edumazet@xxxxxxxxxx>
>>> Date:   Tue May 16 14:00:14 2017 -0700
>>>
>>>     tcp: switch TCP TS option (RFC 7323) to 1ms clock
>>>
>>> [...]
>>>
>>>> [...]
>>>>
>>>> 3. Since this is related to time and clocks, the last thing today we
>>>> tried was, instead of using default settings, put "clocksource=tsc
>>>> tsc=stable:socket" on the xen command line and "clocksource=tsc" on the
>>>> domU linux kernel line. What we observed after doing this, is that the
>>>> failure happens less often, but still happens. Everything else applies.
>>>
>>> Actually, it seems that the important thing is that uptime of the dom0s
>>> is not very close to each other. After rebooting all four back without
>>> tsc options, and then a few hours later rebooting one of them again, I
>>> could easily reproduce again when live migrating to the later rebooted
>>> server.
>>>
>>>> Additional question:
>>>>
>>>> It's 2018, should we have these "clocksource=tsc tsc=stable:socket" on
>>>> Xen and "clocksource=tsc" anyways now, for Xen 4.11 and Linux 4.19
>>>> domUs? All our hardware has 'TscInvariant = true'.
>>>>
>>>> Related: https://news.ycombinator.com/item?id=13813079
>>>
>>> This is still interesting.
>>>
>>> ---- >8 ----
>>>
>>> Now, the next question is... is 9a568de481 bad, or shouldn't there be 18
>>> gazillion whatever uptime already... In Linux 4.9, this doesn't happen,
>>> so next task will be to find out where that started.
>>
>> And that's...
>>
>> commit f94c8d116997597fc00f0812b0ab9256e7b0c58f
>> Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Date:   Wed Mar 1 15:53:38 2017 +0100
>>
>>     sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface
>>
>> a.k.a. v4.11-rc2~30^2
>>
>> Before this commit, time listed in dmesg seems to follow uptime of the
>> domU, and after it, time in dmesg seems to jump around up and down when
>> live migrating to different dom0s, with the occasional/frequent jump to
>> a number above 18000000000 which then also shows the TCP timestamp
>> breakage since 9a568de4.
>>
>> So, next question is... what now? Any ideas appreciated.
>>
>> Can anyone else reproduce this? I have super-common HP DL360 hardware
>> and mostly default settings, so it shouldn't be that hard.
>>
>> Should I mail some other mailinglist with a question? Which one? Does
>> any of you Xen developers have more experience with time keeping code?
> 
> My gut feeling tells me that above patch was neglecting Xen by setting
> a non-native TSC clock too often to "stable" (the "only call
> clear_sched_clock_stable() when we mark TSC unstable when we use
> native_sched_clock()" part of the commit message).
> 
> I can have a more thorough look after Jan. 7th.

Thanks in advance!

Some additional info:

I've just left a domU running after the initial live migrate:

[  171.727462] Freezing user space processes ... (elapsed 0.002 seconds) done.
[  171.729825] OOM killer disabled.
[  171.729832] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[  171.731439] suspending xenstore...
[  171.731672] xen:grant_table: Grant tables using version 1 layout
[18446742891.874140] OOM killer enabled.
[18446742891.874152] Restarting tasks ... done.
[18446742891.914103] Setting capacity to 6291456
[18446742934.549790] 14:13:50 up 3 min, 2 users, load average: 0.07, 0.02, 0.00
[18446742935.561404] 14:13:51 up 3 min, 2 users, load average: 0.07, 0.02, 0.00
[18446742936.572761] 14:13:52 up 3 min, 2 users, load average: 0.06, 0.02, 0.00
[18446742937.583537] 14:13:53 up 3 min, 2 users, load average: 0.06, 0.02, 0.00

I'm simply doing this:
while true; do echo $(uptime) > /dev/kmsg; sleep 10; done

Now, after a while, this happens:

[18446744050.202985] 14:32:26 up 22 min, 2 users, load average: 0.00, 0.00, 0.00
[18446744060.214576] 14:32:36 up 22 min, 2 users, load average: 0.00, 0.00, 0.00
[18446744070.225909] 14:32:46 up 22 min, 2 users, load average: 0.00, 0.00, 0.00
[    6.527718] 14:32:56 up 22 min, 2 users, load average: 0.00, 0.00, 0.00
[   16.539315] 14:33:06 up 22 min, 2 users, load average: 0.00, 0.00, 0.00
[   26.550511] 14:33:16 up 23 min, 2 users, load average: 0.00, 0.00, 0.00
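
The jump from 18446744070.225909 to 6.527718 looks like a counter wrapping
around. A minimal sketch to check that, assuming the dmesg timestamp is an
unsigned 64-bit nanosecond counter printed as seconds (the names WRAP_NS and
to_ns are made up for this illustration):

```python
from decimal import Decimal

# Assumption: the timestamp is an unsigned 64-bit count of nanoseconds.
WRAP_NS = 2**64


def to_ns(ts: str) -> int:
    """Convert a dmesg 'seconds.microseconds' string to integer nanoseconds."""
    return int(Decimal(ts) * 10**9)


# Last timestamp before the wrap and first one after it, from the log above.
before = to_ns("18446744070.225909")
after = to_ns("6.527718")

# Difference modulo 2**64, i.e. the elapsed time if the counter wrapped once.
step_ns = (after - before) % WRAP_NS
print(step_ns / 1e9)
```

Under that assumption the step between the two lines comes out at roughly
10.01 seconds, the same spacing as the surrounding lines.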

The 23-minute difference is exactly the difference in uptime between
the two dom0s involved in the live migration:

source dom0: up 4 days, 19:23
destination dom0: up 4 days, 19:00

So that explains the 18446742891.874140 number, which corresponds to
something close to 'minus 23 minutes', shown as a huge positive value.
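
A rough sketch of that arithmetic, assuming the dmesg timestamp is a 64-bit
nanosecond counter, so that values at or above 2**63 can be read as negative
(the helper name signed_ns is invented here):

```python
from decimal import Decimal


def signed_ns(ts: str) -> int:
    """Read a dmesg timestamp as a signed 64-bit nanosecond value.

    Assumption: the printed value is an unsigned 64-bit nanosecond counter,
    so anything at or above 2**63 is a negative number that wrapped around.
    """
    ns = int(Decimal(ts) * 10**9)
    if ns >= 2**63:
        ns -= 2**64
    return ns


# Last timestamp before and first timestamp after the live migration.
before_migrate = signed_ns("171.731672")
after_migrate = signed_ns("18446742891.874140")

# How far the clock moved across the migration, in minutes.
jump_ns = after_migrate - before_migrate
print(jump_ns / 1e9 / 60)
```

With these two log lines the jump works out to about minus 22.6 minutes,
which is in line with the 23 minutes of uptime difference between the dom0s
(uptime only shows whole minutes).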

Happy holidays,

Hans
