
Re: Slow (higher scheduling latency?) system after S3 - regression 4.17 -> 4.19 ?


  • To: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 2 Apr 2026 10:23:50 +0200
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Jürgen Groß <jgross@xxxxxxxx>
  • Delivery-date: Thu, 02 Apr 2026 08:23:55 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 27.03.2026 13:47, Marek Marczykowski-Górecki wrote:
> Some further observation, this time regarding timers:
> 
> Just before S3:
> 
>     (XEN) [  150.264757] Dumping timer queues:
>     (XEN) [  150.265181] CPU00:
>     (XEN) [  150.265497]   ex=         736us timer=ffff82d040643540 
> cb=drivers/char/xhci-dbc.c#dbc_uart_poll(ffff82d0404e5e78)
>     (XEN) [  150.266590]   ex=      460377us timer=ffff82d04065d2c0 
> cb=arch/x86/time.c#time_calibration(0000000000000000)
>     (XEN) [  150.267636]   ex=     1377460us timer=ffff82d04065d220 
> cb=arch/x86/time.c#plt_overflow(0000000000000000)
>     (XEN) [  150.268651]   ex=    13525100us timer=ffff82d040644c20 
> cb=arch/x86/cpu/mcheck/intel-nonfatal.c#mce_work_fn(0000000000000000)
>     (XEN) [  150.269820] CPU02:
>     (XEN) [  150.270126]   ex=      616653us timer=ffff830461931070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461931000)
>     (XEN) [  150.271276] CPU04:
>     (XEN) [  150.271577]   ex=      630577us timer=ffff830461935070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461935000)
>     (XEN) [  150.272734]   ex=      634586us timer=ffff830461921070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461921000)
>     (XEN) [  150.273884] CPU06:
>     (XEN) [  150.274184]   ex=       44474us timer=ffff830461905070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461905000)
>     (XEN) [  150.275334] CPU08:
>     (XEN) [  150.275633]   ex=       11465us timer=ffff83046190d070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff83046190d000)
>     (XEN) [  150.276783] CPU09:
>     (XEN) [  150.277083]   ex=       12465us timer=ffff830461909070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461909000)
>     (XEN) [  150.278236] CPU10:
>     (XEN) [  150.278536]   ex=       14464us timer=ffff830461901070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461901000)
>     (XEN) [  150.279686] CPU11:
>     (XEN) [  150.279986]   ex=       15464us timer=ffff830461911070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461911000)
>     (XEN) [  150.281139] CPU12:
>     (XEN) [  150.281438]   ex=       17463us timer=ffff830461919070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461919000)
>     (XEN) [  150.282592] CPU13:
>     (XEN) [  150.282890]   ex=      347090us timer=ffff830461941070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461941000)
>     (XEN) [  150.284040] CPU14:
>     (XEN) [  150.284341]   ex=       20462us timer=ffff83046191d070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff83046191d000)
>     (XEN) [  150.285496] CPU15:
>     (XEN) [  150.285806]   ex=         470us timer=ffff83046192d070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff83046192d000)
>     (XEN) [  150.286972]   ex=       15316us timer=ffff8304619cdaa0 
> cb=drivers/cpufreq/cpufreq_ondemand.c#do_dbs_timer(ffff8304619cdae0)
>     (XEN) [  150.288156]   ex=        6230us timer=ffff8304619e8ed0 
> cb=common/sched/core.c#s_timer_fn(0000000000000000)
>     (XEN) [  150.289192] CPU16:
>     (XEN) [  150.289492]   ex=      633594us timer=ffff830461929070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461929000)
>     (XEN) [  150.290642] CPU18:
>     (XEN) [  150.290942]   ex=      614601us timer=ffff830461939070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461939000)
>     (XEN) [  150.292094] CPU20:
>     (XEN) [  150.292394]   ex=       27459us timer=ffff830461915070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461915000)
>     (XEN) [  150.293553]   ex=       55316us timer=ffff830461979aa0 
> cb=drivers/cpufreq/cpufreq_ondemand.c#do_dbs_timer(ffff830461979ae0)
>     (XEN) [  150.294744]   ex=       35413us timer=ffff83046198fdc0 
> cb=common/sched/core.c#s_timer_fn(0000000000000000)
>     (XEN) [  150.295776] CPU21:
>     (XEN) [  150.296076]   ex=      617580us timer=ffff830461925070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461925000)
> 
> Just after S3:
> 
>     (XEN) [  164.462198] Dumping timer queues:
>     (XEN) [  164.462630] CPU00:
>     (XEN) [  164.462949]   ex=         768us timer=ffff82d040643540 
> cb=drivers/char/xhci-dbc.c#dbc_uart_poll(ffff82d0404e5e78)
>     (XEN) [  164.464064]   ex=     1286532us timer=ffff82d04065d220 
> cb=arch/x86/time.c#plt_overflow(0000000000000000)
>     (XEN) [  164.465099]   ex=      395997us timer=ffff830461921070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461921000)
>     (XEN) [  164.466262]   ex=    15327843us timer=ffff82d040644c20 
> cb=arch/x86/cpu/mcheck/intel-nonfatal.c#mce_work_fn(0000000000000000)
>     (XEN) [  164.467461] CPU02:
>     (XEN) [  164.467773]   ex=       37756us timer=ffff830461931070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461931000)
>     (XEN) [  164.468953] CPU04:
>     (XEN) [  164.469267]   ex=      940042us timer=ffff830461935070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461935000)
>     (XEN) [  164.470444] CPU06:
>     (XEN) [  164.470759]   ex=      485896us timer=ffff830461905070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461905000)
>     (XEN) [  164.471937] CPU08:
>     (XEN) [  164.472251]   ex=      459720us timer=ffff83046190d070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff83046190d000)
>     (XEN) [  164.473430] CPU09:
>     (XEN) [  164.473745]   ex=     1563081us timer=ffff830461909070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461909000)
>     (XEN) [  164.474922] CPU10:
>     (XEN) [  164.475237]   ex=      948204us timer=ffff830461901070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461901000)
>     (XEN) [  164.476417] CPU11:
>     (XEN) [  164.476732]   ex=      948071us timer=ffff830461911070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461911000)
>     (XEN) [  164.477912] CPU12:
>     (XEN) [  164.478219]   ex=      892163us timer=ffff830461919070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461919000)
>     (XEN) [  164.479398] CPU13:
>     (XEN) [  164.479713]   ex=      308429us timer=ffff830461941070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461941000)
>     (XEN) [  164.480893] CPU14:
>     (XEN) [  164.481208]   ex=      954493us timer=ffff83046191d070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff83046191d000)
>     (XEN) [  164.482386] CPU15:
>     (XEN) [  164.482700]   ex=     1487692us timer=ffff83046192d070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff83046192d000)
>     (XEN) [  164.483881] CPU16:
>     (XEN) [  164.484195]   ex=       22890us timer=ffff830461929070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461929000)
>     (XEN) [  164.485373] CPU18:
>     (XEN) [  164.485688]   ex=       85902us timer=ffff830461939070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461939000)
>     (XEN) [  164.486864] CPU20:
>     (XEN) [  164.487171]   ex=         895us timer=ffff830461915070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461915000)
>     (XEN) [  164.488358]   ex=       17895us timer=ffff830461979aa0 
> cb=drivers/cpufreq/cpufreq_ondemand.c#do_dbs_timer(ffff830461979ae0)
>     (XEN) [  164.489569]   ex=        2914us timer=ffff83046198fdc0 
> cb=common/sched/core.c#s_timer_fn(0000000000000000)
>     (XEN) [  164.490633] CPU21:
>     (XEN) [  164.490947]   ex=      461154us timer=ffff830461925070 
> cb=common/sched/core.c#vcpu_singleshot_timer_fn(ffff830461925000)
> 
> In the latter case, I see only one cb=common/sched/core.c#s_timer_fn
> (two of them before). And also only one
> cb=drivers/cpufreq/cpufreq_ondemand.c#do_dbs_timer.

Let's start with this one. It's active when the ondemand governor is in use,
and when a CPU isn't (in the process of being put into) a C-state. There
being only two instances means the system was reasonably idle at the time you
dumped the timer queues ahead of entering S3, and just slightly more idle
after getting back out of S3. I've just checked my Skylake box: without it
ever having gone through an S3 cycle, there's only one instance there when
the system is running nothing but an "xl dmesg".
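In case you want to reproduce that observation, something along these lines
should do (a rough sketch, assuming a dom0 shell and that the 'a' debug key
is what produced the dumps quoted above):

```shell
# Rough sketch (assumes dom0, the ondemand governor, and that the 'a'
# debug key dumps Xen's timer queues, as in the logs quoted above).
for i in $(seq "$(nproc)"); do
    ( while :; do :; done ) &       # one busy loop per dom0 vCPU
done
sleep 5
xl debug-keys a                     # have Xen dump its timer queues
xl dmesg | grep -c do_dbs_timer     # count ondemand timer instances
kill $(jobs -p)                     # stop the busy loops again
```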

If you place enough load on the system to keep all CPUs busy, more instances
of the timer should appear. How many there will be, however, also depends on
the scope of the policy (supplied by firmware). The topology ...

> Is the s_timer_fn related to runqueues? I have two of them:
> 
>     (XEN) [   11.751480] Adding cpu 0 to runqueue 0
>     (XEN) [   11.753580]  First cpu on runqueue, activating
>     (XEN) [   11.755731] Adding cpu 2 to runqueue 0
>     (XEN) [   11.757818] Adding cpu 4 to runqueue 0
>     (XEN) [   11.759891] Adding cpu 6 to runqueue 0
>     (XEN) [   11.761955] Adding cpu 8 to runqueue 0
>     (XEN) [   11.764001] Adding cpu 9 to runqueue 0
>     (XEN) [   11.766029] Adding cpu 10 to runqueue 0
>     (XEN) [   11.768045] Adding cpu 11 to runqueue 0
>     (XEN) [   11.770050] Adding cpu 12 to runqueue 1
>     (XEN) [   11.772033]  First cpu on runqueue, activating
>     (XEN) [   11.774051] Adding cpu 13 to runqueue 1
>     (XEN) [   11.776047] Adding cpu 14 to runqueue 1
>     (XEN) [   11.777993] Adding cpu 15 to runqueue 1
>     (XEN) [   11.779913] Adding cpu 16 to runqueue 1
>     (XEN) [   11.781872] Adding cpu 18 to runqueue 1
>     (XEN) [   11.783912] Adding cpu 20 to runqueue 1
>     (XEN) [   11.785855] Adding cpu 21 to runqueue 1

... of this system looks somewhat "interesting", judging from these CPU
numbers.

Assuming the system was (supposedly) equally idle before and after S3,
the absence of the second timer(s?) may indicate that some (background?)
load that was there before S3 didn't re-appear afterwards. Such background
load could be something that monitors and tunes the system. (Yet, as
indicated, I'm looking at the CPUfreq part only right now; I'm kind of
hoping that a scheduler maintainer may look at the other aspect.)

> Could the issue be related to those timer differences?

Possibly. For the CPUfreq aspect you could simply turn cpufreq handling off
to see whether it makes a difference, or switch to a governor other than
the ondemand one.
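For reference, checking and switching the governor can be done via xenpm;
a sketch, with CPU number and governor name to be adjusted as needed:

```shell
# Sketch: inspect and change the cpufreq governor via xenpm (from dom0).
xenpm get-cpufreq-para 0                    # show CPU0's current governor
xenpm set-scaling-governor 0 performance    # switch CPU0 away from ondemand
# Turning Xen's cpufreq handling off entirely would instead be done via
# the "cpufreq=none" hypervisor command line option (reboot required).
```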

With the ondemand governor, does the pattern that "xenpm get-cpufreq-states"
produces change across S3 (this will require multiple runs)? For example,
does load move towards CPU0? (Come to think of it, did you check whether
interrupt affinities change across S3? More interrupts going to CPU0 could
also be part of the problem, for example.)
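Comparing interrupt affinities across S3 could be as simple as a before/after
snapshot; a sketch of the dom0 side (Xen's own view of the interrupt bindings
should be available via the 'i' debug key):

```shell
# Sketch: snapshot dom0 IRQ affinities before and after suspend/resume.
grep . /proc/irq/*/smp_affinity_list > /tmp/irq-before
# ... enter and leave S3 here ...
grep . /proc/irq/*/smp_affinity_list > /tmp/irq-after
diff /tmp/irq-before /tmp/irq-after     # did affinities move to CPU0?
```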

Jan
