[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH 00/26] Runtime paravirt patching

So, first thanks for the quick comments even though some of my choices
were straight NAKs (or maybe because of that!)

Second, I clearly did a bad job of motivating the series. Let me try
to address the motivation comments first and then I can address the
technical concerns separately.

[ I'm collating all the motivation comments below. ]

A KVM host (or another hypervisor) might advertise paravirtualized
features and optimization hints (ex KVM_HINTS_REALTIME) which might
become stale over the lifetime of the guest. For instance, the

Thomas> If your host changes his advertised behaviour then you want to
Thomas> fix the host setup or find a competent admin.

Juergen> Then this hint is wrong if it can't be guaranteed.

I agree, the hint behaviour is wrong and the host shouldn't be giving
hints it can only temporarily honor.
The host problem is hard to fix though: the behaviour change is
either because of a guest migration or in case of a hosted guest,
cloud economics -- customers want to go to a 2-1 or worse VCPU-CPU
ratio at times of low load.

I had an offline discussion with Paolo Bonzini where he agreed that
it makes sense to make KVM_HINTS_REALTIME a dynamic hint rather than
static as it is now. (That was really the starting point for this

host might go from being undersubscribed to being oversubscribed
(or the other way round) and it would make sense for the guest
switch pv-ops based on that.

Juergen> I think using pvops for such a feature change is just wrong.
Juergen> What comes next? Using pvops for being able to migrate a guest
Juergen> from an Intel to an AMD machine?

My statement about switching pv-ops was too broadly worded. What
I meant to say was that KVM guests choose pv_lock_ops to be native
or paravirt based on undersubscribed/oversubscribed hint at boot,
and this choice should be available at run-time as well.

KVM chooses between native/paravirt spinlocks at boot based on this
reasoning (from commit b2798ba0b8):
"Waiman Long mentioned that:
Generally speaking, unfair lock performs well for VMs with a small
number of vCPUs. Native qspinlock may perform better than pvqspinlock
if there is vCPU pinning and there is no vCPU over-commitment.

PeterZ> So what, the paravirt spinlock stuff works just fine when
PeterZ> you're not oversubscribed.
Yeah, the paravirt spinlocks work fine for both under and oversubscribed
hosts, but they are more expensive and that extra cost provides no benefits
when CPUs are pinned.
For instance, pvqueued spin_unlock() is a call+locked cmpxchg as opposed
to just a movb $0, (%rdi).

This difference shows up in kernbench running on a KVM guest with native
and paravirt spinlocks. I ran with 8 and 64 CPU guests with CPUs pinned.

The native version performs same or better.

8 CPU       Native  (std-dev)  Paravirt (std-dev)
            -----------------  -----------------
-j  4: sys  151.89  ( 0.2462)  160.14   ( 4.8366)    +5.4%
-j 32: sys  162.715 (11.4129)  170.225  (11.1138)    +4.6%
-j  0: sys  164.193 ( 9.4063)  170.843  ( 8.9651)    +4.0%

64 CPU       Native  (std-dev)  Paravirt (std-dev)
            -----------------  -----------------
-j  32: sys 209.448 (0.37009)  210.976   (0.4245)    +0.7%
-j 256: sys 267.401 (61.0928)  285.73   (78.8021)    +6.8%
-j   0: sys 286.313 (56.5978)  307.721  (70.9758)    +7.4%

In all cases the pv_kick, pv_wait numbers were minimal as expected.
The lock_slowpath counts were higher with PV but AFAICS the native
and paravirt lock_slowpath are not directly comparable.

Detailed kernbench numbers attached.


Attachment: 8-cpus.txt
Description: Text document

Attachment: 64-cpus.txt
Description: Text document



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.