
Re: Recent upgrade of 4.13 -> 4.14 issue


  • To: Dario Faggioli <dfaggioli@xxxxxxxx>, Jürgen Groß <jgross@xxxxxxxx>, "George.Dunlap@xxxxxxxxxx" <George.Dunlap@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Frédéric Pierret <frederic.pierret@xxxxxxxxxxxx>
  • Date: Tue, 27 Oct 2020 17:06:58 +0100
  • Cc: "marmarek@xxxxxxxxxxxxxxxxxxxxxx" <marmarek@xxxxxxxxxxxxxxxxxxxxxx>, "andrew.cooper3@xxxxxxxxxx" <andrew.cooper3@xxxxxxxxxx>
  • Delivery-date: Tue, 27 Oct 2020 19:14:11 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>



On 10/27/20 at 4:42 PM, Frédéric Pierret wrote:


On 10/27/20 at 10:22 AM, Dario Faggioli wrote:
On Tue, 2020-10-27 at 06:58 +0100, Jürgen Groß wrote:
On 26.10.20 17:31, Dario Faggioli wrote:

Or did you have something completely different in mind, and I'm
missing it?

No, I think you are right. I mixed that up with __context_switch()
not being called.

Right.

Sorry for the noise,

Sure, no problem.

In fact, this issue is apparently scheduler-independent. It indeed
seemed to be related to the other report we have "BUG: credit=sched2
machine hang when using DRAKVUF", but there it looks like it is
scheduler-dependent.

Might it be something that lies somewhere else, but Credit2 is
triggering it faster/easier? (Just thinking out loud...)

For Frederic, what happens is that dom0 hangs, right? So you're able to
poke at Xen with some debugkeys (like 'r' for the scheduler's status,
and the ones for the domain's vCPUs)?

If yes, it may be useful to see the output.

Regards


I have some new info regarding your request. Yes, dom0 hangs and I can still
interact with the serial console. I managed to obtain the output of the 'r'
debug-key:

```
(XEN) sched_smt_power_savings: disabled
(XEN) NOW=72810702614697
(XEN) Online Cpus: 0-15
(XEN) Cpupool 0:
(XEN) Cpus: 0-15
(XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
(XEN) Scheduler: SMP Credit Scheduler (credit)
(XEN) info:
(XEN)     ncpus              = 16
(XEN)     master             = 0
(XEN)     credit             = 4800
(XEN)     credit balance     = 608
(XEN)     weight             = 12256
(XEN)     runq_sort          = 996335
(XEN)     default-weight     = 256
(XEN)     tslice             = 30ms
(XEN)     ratelimit          = 1000us
(XEN)     credits per msec   = 10
(XEN)     ticks per tslice   = 3
(XEN)     migration delay    = 0us
(XEN) idlers: 00000000,00003c99
(XEN) active units:
(XEN)       1: [0.1] pri=-1 flags=0 cpu=6 credit=214 [w=2000,cap=0]
(XEN)       2: [0.4] pri=-1 flags=0 cpu=8 credit=115 [w=2000,cap=0]
(XEN)       3: [0.5] pri=-1 flags=0 cpu=5 credit=239 [w=2000,cap=0]
(XEN)       4: [0.11] pri=-1 flags=0 cpu=1 credit=-55 [w=2000,cap=0]
(XEN)       5: [0.6] pri=-2 flags=0 cpu=15 credit=-177 [w=2000,cap=0]
(XEN)       6: [0.7] pri=-1 flags=0 cpu=2 credit=50 [w=2000,cap=0]
(XEN)       7: [19.1] pri=-2 flags=0 cpu=9 credit=-241 [w=256,cap=0]
(XEN) CPUs info:
(XEN) CPU[00] current=d[IDLE]v0, curr=d[IDLE]v0, prev=NULL
(XEN) CPU[00] nr_run=0, sort=996334, sibling={0}, core={0-7}
(XEN) CPU[01] current=d0v11, curr=d0v11, prev=NULL
(XEN) CPU[01] nr_run=1, sort=996335, sibling={1}, core={0-7}
(XEN)     run: [0.11] pri=-1 flags=0 cpu=1 credit=-55 [w=2000,cap=0]
(XEN)       1: [32767.1] pri=-64 flags=0 cpu=1
(XEN) CPU[02] current=d0v7, curr=d0v7, prev=NULL
(XEN) CPU[02] nr_run=1, sort=996335, sibling={2}, core={0-7}
(XEN)     run: [0.7] pri=-1 flags=0 cpu=2 credit=50 [w=2000,cap=0]
(XEN)       1: [32767.2] pri=-64 flags=0 cpu=2
(XEN) CPU[03] current=d[IDLE]v3, curr=d[IDLE]v3, prev=NULL
(XEN) CPU[03] nr_run=0, sort=996329, sibling={3}, core={0-7}
(XEN) CPU[04] current=d[IDLE]v4, curr=d[IDLE]v4, prev=NULL
(XEN) CPU[04] nr_run=0, sort=996325, sibling={4}, core={0-7}
(XEN) CPU[05] current=d0v5, curr=d0v5, prev=NULL
(XEN) CPU[05] nr_run=1, sort=996334, sibling={5}, core={0-7}
(XEN)     run: [0.5] pri=-1 flags=0 cpu=5 credit=239 [w=2000,cap=0]
(XEN)       1: [32767.5] pri=-64 flags=0 cpu=5
(XEN) CPU[06] current=d0v1, curr=d0v1, prev=NULL
(XEN) CPU[06] nr_run=1, sort=996334, sibling={6}, core={0-7}
(XEN)     run: [0.1] pri=-1 flags=0 cpu=6 credit=214 [w=2000,cap=0]
(XEN)       1: [32767.6] pri=-64 flags=0 cpu=6
(XEN) CPU[07] current=d[IDLE]v7, curr=d[IDLE]v7, prev=NULL
(XEN) CPU[07] nr_run=0, sort=996303, sibling={7}, core={0-7}
(XEN) CPU[08] current=d[IDLE]v8, curr=d[IDLE]v8, prev=NULL
(XEN) CPU[08] nr_run=2, sort=996335, sibling={8}, core={8-15}
(XEN)       1: [0.4] pri=-1 flags=0 cpu=8 credit=115 [w=2000,cap=0]
(XEN) CPU[09] current=d19v1, curr=d19v1, prev=NULL
(XEN) CPU[09] nr_run=1, sort=996335, sibling={9}, core={8-15}
(XEN)     run: [19.1] pri=-2 flags=0 cpu=9 credit=-241 [w=256,cap=0]
(XEN)       1: [32767.9] pri=-64 flags=0 cpu=9
(XEN) CPU[10] current=d[IDLE]v10, curr=d[IDLE]v10, prev=NULL
(XEN) CPU[10] nr_run=0, sort=996334, sibling={10}, core={8-15}
(XEN) CPU[11] current=d[IDLE]v11, curr=d[IDLE]v11, prev=NULL
(XEN) CPU[11] nr_run=0, sort=996331, sibling={11}, core={8-15}
(XEN) CPU[12] current=d[IDLE]v12, curr=d[IDLE]v12, prev=NULL
(XEN) CPU[12] nr_run=0, sort=996333, sibling={12}, core={8-15}
(XEN) CPU[13] current=d[IDLE]v13, curr=d[IDLE]v13, prev=NULL
(XEN) CPU[13] nr_run=0, sort=996334, sibling={13}, core={8-15}
(XEN) CPU[14] current=d0v14, curr=d0v14, prev=NULL
(XEN) CPU[14] nr_run=1, sort=990383, sibling={14}, core={8-15}
(XEN)     run: [0.14] pri=0 flags=0 cpu=14 credit=-514 [w=2000,cap=0]
(XEN)       1: [32767.14] pri=-64 flags=0 cpu=14
(XEN) CPU[15] current=d0v6, curr=d0v6, prev=NULL
(XEN) CPU[15] nr_run=1, sort=996335, sibling={15}, core={8-15}
(XEN)     run: [0.6] pri=-2 flags=0 cpu=15 credit=-177 [w=2000,cap=0]
(XEN)       1: [32767.15] pri=-64 flags=0 cpu=15
```
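
For anyone reading along, one way to skim a dump like this is to look for CPUs that report an idle vCPU as `current` while their `nr_run` counter is non-zero. The following is only a minimal Python sketch under that assumption: the helper name `idle_cpus_with_runnable_units` is hypothetical, the regular expressions are written against the line layout shown above rather than any documented format, and a single snapshot may simply have caught a transient state, so a hit is a hint and not a diagnosis.

```python
import re

# Patterns written against the 'r' debug-key layout shown above.
# They are an assumption about this particular output, not a stable format.
CPU_CURRENT = re.compile(r"\(XEN\) CPU\[(\d+)\] current=(\S+?),")
CPU_NR_RUN = re.compile(r"\(XEN\) CPU\[(\d+)\] nr_run=(\d+),")


def idle_cpus_with_runnable_units(dump):
    """Return (cpu, nr_run) pairs for CPUs running the idle vCPU
    while reporting nr_run > 0 (hypothetical helper)."""
    current = {}
    nr_run = {}
    for line in dump.splitlines():
        m = CPU_CURRENT.search(line)
        if m:
            current[int(m.group(1))] = m.group(2)
            continue
        m = CPU_NR_RUN.search(line)
        if m:
            nr_run[int(m.group(1))] = int(m.group(2))
    return [
        (cpu, nr_run[cpu])
        for cpu in sorted(current)
        if current[cpu].startswith("d[IDLE]") and nr_run.get(cpu, 0) > 0
    ]


# Excerpt copied from the dump above (CPUs 0, 8 and 9 only).
SAMPLE = """\
(XEN) CPU[00] current=d[IDLE]v0, curr=d[IDLE]v0, prev=NULL
(XEN) CPU[00] nr_run=0, sort=996334, sibling={0}, core={0-7}
(XEN) CPU[08] current=d[IDLE]v8, curr=d[IDLE]v8, prev=NULL
(XEN) CPU[08] nr_run=2, sort=996335, sibling={8}, core={8-15}
(XEN)       1: [0.4] pri=-1 flags=0 cpu=8 credit=115 [w=2000,cap=0]
(XEN) CPU[09] current=d19v1, curr=d19v1, prev=NULL
(XEN) CPU[09] nr_run=1, sort=996335, sibling={9}, core={8-15}
"""

print(idle_cpus_with_runnable_units(SAMPLE))  # -> [(8, 2)]
```

On the excerpt, only CPU 8 is reported: it shows `d[IDLE]v8` as current with `nr_run=2` and unit `[0.4]` still queued.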

I attempted to get '*' as well, but that blocked my serial console; at least, I
was still unable to interact with it a few minutes later. I'll try to get other
info too. I've also uploaded the part of this huge '*' dump that I captured here:
https://gist.github.com/fepitre/36923fbc08cc2fd8bdb59b81e73a6c2e

Regards

OK, the server froze just a few minutes after my mail, and I have now captured:
'r': https://gist.github.com/fepitre/78541f555902275d906d627de2420571
'q': https://gist.github.com/fepitre/0ddf6b5e8fdb3152d24337d83fdc345e
'I': https://gist.github.com/fepitre/50c68233d08ad1e495edf7e0e146838b

Tell me if I can provide any other info from the serial console.

Regards,
Frédéric



 

