
Re: [Xen-devel] [PATCH RFC v3 3/6] sched/idle: Add a generic poll before enter real idle path

To: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>, Peter Zijlstra <peterz@xxxxxxxxxxxxx>
From: Quan Xu <quan.xu0@xxxxxxxxx>
Date: Mon, 20 Nov 2017 15:05:01 +0800
Cc: Yang Zhang <yang.zhang.wz@xxxxxxxxx>, Len Brown <len.brown@xxxxxxxxx>, kvm@xxxxxxxxxxxxxxx, linux-doc@xxxxxxxxxxxxxxx, x86@xxxxxxxxxx, LKML <linux-kernel@xxxxxxxxxxxxxxx>, virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx, Kyle Huey <me@xxxxxxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>, Borislav Petkov <bp@xxxxxxxxx>, Quan Xu <quan.xu03@xxxxxxxxx>, Andy Lutomirski <luto@xxxxxxxxxx>, "H. Peter Anvin" <hpa@xxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx, Tom Lendacky <thomas.lendacky@xxxxxxx>, Tobias Klauser <tklauser@xxxxxxxxxx>
Delivery-date: Mon, 20 Nov 2017 07:05:34 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>



On 2017-11-16 17:45, Daniel Lezcano wrote:

On 16/11/2017 10:12, Quan Xu wrote:


On 2017-11-16 06:03, Thomas Gleixner wrote:

On Wed, 15 Nov 2017, Peter Zijlstra wrote:

On Mon, Nov 13, 2017 at 06:06:02PM +0800, Quan Xu wrote:

From: Yang Zhang <yang.zhang.wz@xxxxxxxxx>

Implement a generic idle poll which resembles the functionality
found in arch/. Provide a weak arch_cpu_idle_poll() function which
can be overridden by the architecture code if needed.

No, we want less of those magic hooks, not more.

Interrupts that arrive in the idle loop do not necessarily cause a
reschedule. In a KVM guest, this costs several VM-exit/VM-entry cycles:
a VM-entry to deliver the interrupt, followed immediately by a VM-exit.
This makes idle more expensive than on bare metal. Add a generic idle
poll before entering the real idle path. When a reschedule event is
pending, we can bypass the real idle path.
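
The poll-then-idle flow described in this changelog can be sketched in
userspace C. This is only an illustration under stated assumptions, not
the code from the patch: struct sim_cpu, sim_need_resched(),
sim_idle_poll() and sim_do_idle() are hypothetical stand-ins, the clock
is simulated, and the "halts" counter stands for the real idle entry
(the HLT that would cost a VM-exit in a guest).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulated CPU state; none of this is a kernel API. */
struct sim_cpu {
	uint64_t now_ns;        /* simulated clock */
	uint64_t resched_at_ns; /* time at which a reschedule becomes pending */
	unsigned int halts;     /* how often the real idle path was entered */
};

/* Stand-in for a "reschedule pending" check. */
static bool sim_need_resched(const struct sim_cpu *cpu)
{
	return cpu->now_ns >= cpu->resched_at_ns;
}

/*
 * Poll for at most poll_ns, advancing the simulated clock by step_ns
 * per iteration. Returns true if a reschedule became pending.
 */
static bool sim_idle_poll(struct sim_cpu *cpu, uint64_t poll_ns,
			  uint64_t step_ns)
{
	uint64_t deadline = cpu->now_ns + poll_ns;

	assert(step_ns > 0); /* otherwise the loop would never finish */

	while (cpu->now_ns < deadline) {
		if (sim_need_resched(cpu))
			return true;
		cpu->now_ns += step_ns;
	}
	return sim_need_resched(cpu);
}

/* One pass through the idle loop: poll first, halt only if needed. */
static void sim_do_idle(struct sim_cpu *cpu, uint64_t poll_ns,
			uint64_t step_ns)
{
	if (sim_idle_poll(cpu, poll_ns, step_ns))
		return; /* work is pending: bypass the real idle path */

	cpu->halts++; /* real idle entry (expensive in a guest) */
	if (cpu->resched_at_ns > cpu->now_ns)
		cpu->now_ns = cpu->resched_at_ns; /* woken by the event */
}
```

With a 1000 ns poll window, an event due after 500 ns is caught while
polling and the halt is skipped; an event due after 5000 ns is not, and
the simulated CPU halts once.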

Why not do a HV specific idle driver?

If I understand the problem correctly then he wants to avoid the heavy
lifting in tick_nohz_idle_enter() in the first place, but there is
already an interesting quirk there which makes it exit early. See commit
3c5d92a0cfb5 ("nohz: Introduce arch_needs_cpu"). The reason for this
commit looks similar. But let's not proliferate that. I'd rather see
that go away.

Agreed.

We can get even more benefit than commit 3c5d92a0cfb5 ("nohz: Introduce
arch_needs_cpu") in a KVM guest. I won't proliferate that.

But the irq_timings stuff is heading into the same direction, with a more
complex prediction logic which should tell you pretty good how long that
idle period is going to be and in case of an interrupt heavy workload
this would skip the extra work of stopping and restarting the tick and
provide a very good input into a polling decision.


Interesting. I have tested with the IRQ_TIMINGS-related code, which seems
not working so far.

I don't know how you tested it. Can you elaborate on what you meant by
"seems not working so far"?

Daniel, I tried to enable IRQ_TIMINGS* manually. I used
irq_timings_next_event() to return an estimate of the earliest
interrupt. However, I got a constant.

There is still some work to do to make it more efficient. The prediction
based on the irq timings is all right if the interrupts have a simple
periodicity. But as soon as there is a pattern, the current code can't
handle it properly and makes bad predictions.
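
To illustrate the distinction between a simple periodicity and a
pattern, here is a small sketch. It assumes a naive predictor that
takes the mean of the observed intervals; this is not the kernel's
irq_timings algorithm, and predict_next_interval() and abs_error() are
hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Naive predictor (an assumption for illustration): the next interval
 * is predicted as the integer mean of the n observed intervals.
 */
static uint64_t predict_next_interval(const uint64_t *intervals, size_t n)
{
	uint64_t sum = 0;

	assert(n > 0);

	for (size_t i = 0; i < n; i++)
		sum += intervals[i];

	return sum / n;
}

/* Absolute difference between a prediction and the observed value. */
static uint64_t abs_error(uint64_t predicted, uint64_t actual)
{
	return predicted > actual ? predicted - actual : actual - predicted;
}
```

For strictly periodic intervals such as 1000, 1000, 1000, 1000 the mean
is exact. For a repeating pattern such as 100, 100, 1900 the mean over
two repetitions is 700, which matches neither the short nor the long
interval, so a poll decision based on it would be wrong every time.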

I'm working on a self-learning pattern detection which is too heavy for
the kernel, and with it we should be able to detect the patterns
properly and re-adjust the period if it changes. I'm in the process of
making it suitable for kernel code (both math and perf).

One improvement which can be done right now and which can help you is
the interrupt rate on the CPU. It is possible to compute it, and that
will give accurate information for the polling decision.
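
One way such a rate could feed a polling decision is sketched below.
This is a guess at the idea, not existing kernel code: struct irq_rate,
irq_rate_account(), irq_rate_per_sec(), should_poll() and the
poll_budget_ns parameter are all hypothetical, and the sketch ignores
overflow of count * NSEC_PER_SEC for very large counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical per-CPU accounting window. */
struct irq_rate {
	uint64_t window_start_ns; /* start of the measurement window */
	uint64_t count;           /* interrupts seen since window start */
};

/* Called once per interrupt handled on this CPU. */
static void irq_rate_account(struct irq_rate *r)
{
	r->count++;
}

/* Interrupts per second over the window that ends at now_ns. */
static uint64_t irq_rate_per_sec(const struct irq_rate *r, uint64_t now_ns)
{
	assert(now_ns > r->window_start_ns);

	return r->count * NSEC_PER_SEC / (now_ns - r->window_start_ns);
}

/*
 * Poll only if the mean gap between interrupts fits within the time
 * we are willing to spend polling (poll_budget_ns, an assumed tunable).
 */
static bool should_poll(uint64_t rate_per_sec, uint64_t poll_budget_ns)
{
	if (rate_per_sec == 0)
		return false;

	return NSEC_PER_SEC / rate_per_sec <= poll_budget_ns;
}
```

For example, 50 interrupts in a 10 ms window give 5000 interrupts per
second, i.e. a mean gap of 200000 ns; polling is chosen with a budget
of 200000 ns but not with a budget of 100000 ns.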

As tglx said, talk to each other / work together to make it usable for
all use cases.

Could you share how to enable it to get the interrupt rate on the CPU?
I can try it in a cloud scenario. Of course, I'd like to work with you
to improve it.

Quan
Alibaba Cloud



