Re: [Xen-devel] [PATCH] libxl: Make 'xl vcpu-set' work properly on overcommited hosts.



On Wed, 2013-05-08 at 23:39 +0100, Konrad Rzeszutek Wilk wrote:
> > > > > Well, overcommit comes in mind. Say you migrate to a 4PCPU box and you
> > > > > have 12VCPUs, then you decide to go down to 4, then back to 16 before
> > > > > migrating it to some other box. Can't do.
> > > > 
> > > > You could do it *after* the migration back to a 16 way box instead of
> > > > before though, which is most likely when you would actually want to do
> > > > it...
> > > 
> > > I am kind of lost. Are we arguing for this being a bug or whether there is
> > > justification for putting in Xen 4.3?
> > 
> > The former needs deciding before the latter.
> > 
> > I'm not convinced that the current xl behaviour of refusing to
> > overcommit VCPUs on a host isn't the right one for the majority of use
> > cases. Obviously the silently refusing bit is a bug which should be
> > fixed.
> > 
> > I don't buy that this is a "regression compared to Xend". It's certainly
> > a difference from how xend behaved but it seems on the whole to be a
> > positive one (i.e. xend was wrong).
> 
> CC-ing Juergen here as he added this in.
> > 
> > Can you explain the use case for wanting to do this? I don't think the
> > migration one you give above is very convincing since a normal user
> > wouldn't want to overcommit on the source host, they would want to
> > migrate and then increase the number of vcpus, without ever
> > overcommitting, and therefore without the terrible performance of
> > overcommitting.
> 
> It seems clear to me if a user wants to over-commit, then we should
> allow the user to do so. If we provide a command to set X vCPUs it
> should work as described - without the extra checks (unless
> that is enabled by some other option).
> 
> That is currently not the case, and the documentation does not explain
> why the limit was implemented. Looking back at the changeset that
> introduced this, c/s 22918 (CC Juergen here), it looks as if the
> original behavior was to do it as Xend does.

I'm afraid that doesn't make it correct.

> If we want a behavior where we don't allow overcommit (which sounds
> bogus - that is part of the allure of virtualization)

Note that I'm not against overcommitting the PCPUs on the host in
general, just against giving a single VM more VCPUs than the host has
PCPUs, which has only negative effects.

>  then we should also
> tweak other 'xl' commands. For example you can do a similar scenario in
> which you launch say sixteen 1VCPU guests and you get the same performance
> characteristics as if you launch one guest with 16VCPUs on a 4PCPU machine.
> Or perhaps worse.

The latter case is much worse and that's the only one I'm arguing we
should avoid.

In the 16VMsx1VCPUs sharing 4PCPUs case those 16 VCPUs can be idling etc
and so you can quite possibly pack them onto 4 PCPUs. Or if they are
busy then they get timesliced in a fair way etc. All good and proper and
yes this is part of the allure of virtualisation.

If you have 1VMx16VCPUs sharing 4PCPUs then this is not the same
situation. That guest can only ever have 4VCPUs running simultaneously,
so at most it can use 400% of a CPU. All the other 12VCPUs do is divide
that 400% more thinly and cause scheduling contention, locking overhead
(especially internally to the guest) and other negative impacts. The VM
is contending with itself for no benefit.
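
To put rough numbers on that, here is a back-of-the-envelope sketch. It
assumes the guest has the whole host to itself, every VCPU is busy, and
the scheduler shares PCPU time evenly; the function names are made up
for illustration and are not part of xl or libxl.

```python
def guest_capacity(vcpus, pcpus):
    """Most PCPUs' worth of work a single guest can do at any instant."""
    # A VCPU needs a PCPU to run on, so no more than `pcpus` of the
    # guest's VCPUs can be running at the same moment.
    return min(vcpus, pcpus)


def per_vcpu_share(vcpus, pcpus):
    """Fraction of one PCPU each VCPU gets when all VCPUs are busy."""
    # Assumption: perfectly fair time slicing and no scheduling overhead.
    return guest_capacity(vcpus, pcpus) / vcpus


pcpus = 4
for vcpus in (1, 4, 8, 16):
    print(f"{vcpus:2d} VCPUs on {pcpus} PCPUs: "
          f"guest capacity {guest_capacity(vcpus, pcpus) * 100}%, "
          f"{per_vcpu_share(vcpus, pcpus) * 100:.0f}% of a PCPU per VCPU")
```

Going from 4 to 16 VCPUs leaves the guest's capacity at 400% while each
VCPU's share drops from a full PCPU to a quarter of one. The sketch
ignores the contention and locking overhead, so the real picture is
worse than this.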

If you have a scenario where running a 16VCPU VM on a 4PCPU system is
beneficial to the guest (or perhaps the host) then please tell us what
it is, because that would indeed be a useful justification for allowing
this behaviour by default.

The only scenario which has been mentioned so far in this thread is
deliberately provoking the kinds of bad behaviours I've mentioned above,
for testing and development purposes (i.e. exposing guest locking
contention), which is the main reason I'm willing to consider an
override to allow this, but this isn't an argument for allowing it by
default.
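
A deny-by-default check with an explicit override could look roughly
like the sketch below. This is only an illustration of the policy under
discussion, not what libxl does today; check_vcpu_request() and its
force parameter are hypothetical names.

```python
def check_vcpu_request(requested, host_pcpus, force=False):
    """Decide whether a VCPU count is acceptable for a single guest.

    Returns (allowed, message). message is None when there is nothing
    to report, so a refusal is never silent.
    """
    if requested < 1:
        raise ValueError("a guest needs at least one VCPU")

    if requested <= host_pcpus:
        return True, None

    detail = (f"requested {requested} VCPUs but the host has only "
              f"{host_pcpus} PCPUs")
    if force:
        # Hypothetical override: the user has said "yes I mean it".
        return True, f"warning: {detail}; overcommitting as requested"
    return False, f"error: {detail}; use the override to force this"


for force in (False, True):
    allowed, message = check_vcpu_request(16, 4, force=force)
    print(allowed, message)
```

The point of returning a message in both the refused and the forced
case is that the user always learns why the request was treated
specially.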

> Or say you launch a guest with sixteen VCPUs without trouble right now, but
> then if you decrease to one and then want to go back to sixteen it refuses
> to do so (without telling you why).

I think it is a bug that xl lets you silently create a 16VCPU guest on a
4PCPU host without at least warning you or better requiring you to say
"yes I mean it".

> But if it tells you why (say by adding a warning that says "You don't
> want to overcommit") then the user will ask two questions (I would :-)):
>  a) Why didn't xl create complain then initially?
>  b) Why can't I overcommit? I want to and this 'xl' is refusing to let me!
>  c) It worked with xend!

Please stop making argument c). The purpose of xl's xm compatibility is
to offer an upgrade path, it is not to copy every possible misbehaviour
which xend had.

Yes, this is your mother's classic "if xend jumped off a bridge"
argument.

> > A compromise might be a non-default option to allow users to force
> > overcommit but to otherwise deny it.
> 
> That can be done. But I would turn it around. Allow it by default and
> provide another option (perhaps a default global in /etc/xen/xl.conf)
> that will disable overcommitting as much as possible.
> 
> And also why don't we want overcommitting?

Because the sort of overcommit we are talking about here has only
negative effects on the guest.

Ian.

