
Re: [Xen-devel] [RFC v2][PATCH 1/3] docs: design and intended usage for NUMA-aware ballooning



>>> On 16.08.13 at 06:13, Yechen Li <lccycc123@xxxxxxxxx> wrote:
> +So, in NUMA aware ballooning, ballooning down and up works as follows:
> +
> +* target < current usage -- first of all, the ballooning driver uses the
> +  PNODE\_TO\_VNODE() service (provided by the virtual topology implementation,
> +  as explained above) to translate _pnid_ (that it reads from xenstore) to
> +  the id(s) of the corresponding set of vnode IDs, say _{vnids}_ (which will

This looks conceptually wrong: The balloon driver should have no
need to know about pNID-s; it should be the tool stack doing the
translation prior to writing the xenstore node.

Further, the new xenstore node would presumably better be a mask
than a single vNID: in order to e.g. balloon up another guest
already spanning multiple nodes, the tool stack needs a way to ask
for memory on any of the spanned nodes.
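
To illustrate the two points above, here is a rough sketch of what the
tool stack side translation could look like. Everything in it is
hypothetical: the function name, the flat vnode_to_pnode[] array, and
the 64-bit mask width are assumptions made for the example, not
existing libxl or xenstore interfaces.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 64u  /* assumption: node IDs fit in a 64-bit mask */

/*
 * Hypothetical tool stack helper: given the guest's virtual-to-physical
 * node map and a mask of physical nodes on which memory is wanted,
 * return the mask of the guest's virtual nodes backed by those nodes.
 * The result is what would get written to the xenstore node, so the
 * balloon driver never sees a pNID.
 */
static uint64_t pnodes_to_vnode_mask(const unsigned int *vnode_to_pnode,
                                     unsigned int nr_vnodes,
                                     uint64_t pnode_mask)
{
    uint64_t vnode_mask = 0;
    unsigned int v;

    assert(nr_vnodes <= MAX_NODES);

    for (v = 0; v < nr_vnodes; v++) {
        unsigned int p = vnode_to_pnode[v];

        /* Select vnode v if its backing pnode is one of the wanted ones. */
        if (p < MAX_NODES && ((pnode_mask >> p) & 1))
            vnode_mask |= (uint64_t)1 << v;
    }

    return vnode_mask;
}
```

With a guest whose four vnodes sit on pnodes 0, 0, 2 and 3, asking for
memory on pnodes 0 and 3 would select vnodes 0, 1 and 3.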

And finally, coming back to what Tim had already pointed out - doing
things the way you propose can cause an imbalance in the
ballooned down guest, penalizing it in favor of not penalizing the
intended consumer of the recovered memory. Therefore I wonder
whether, without any new xenstore node, it wouldn't be better to
simply require conforming balloon drivers to balloon out memory
evenly across the domain's virtual nodes.
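
As a sketch of what "evenly" could mean for a conforming driver (names
are made up, and an equal split per vnode is just one possible reading;
splitting in proportion to each vnode's size would be another):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: split a balloon-out request of nr_pages pages
 * as evenly as possible across nr_vnodes virtual nodes. Any remainder
 * is spread one page at a time over the lowest-numbered vnodes.
 * per_vnode[] must have room for nr_vnodes entries.
 */
static void balloon_split_evenly(unsigned long nr_pages, size_t nr_vnodes,
                                 unsigned long *per_vnode)
{
    unsigned long base, rem;
    size_t v;

    if (nr_vnodes == 0)
        return;

    assert(per_vnode != NULL);

    base = nr_pages / nr_vnodes;
    rem = nr_pages % nr_vnodes;

    for (v = 0; v < nr_vnodes; v++)
        per_vnode[v] = base + (v < rem ? 1 : 0);
}
```

This needs no input from the tool stack beyond the usual target, which
is the point of the suggestion.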

> +The biggest difference between current and NUMA-aware ballooning is that the
> +latter needs to keep multiple lists of the ballooned pages in an array, with
> +one element for each virtual node. This way, it is always evident, at any
> +given time, what ballooned pages belong to what vnode.

That's wrong afaict: ballooned out pages aren't associated with any
memory, and hence can't be associated with any vNID. Once they
get re-populated, which vNID the memory belongs to is an attribute
of the memory coming in, not the control structure that it's to be
associated with.

I believe this thinking of yours stems from the fact that in Linux the
page control structures are associated with nodes by way of the
physical memory map being split into larger pieces, each coming from
a particular node. But other OSes don't need to follow this model,
and what you propose would also exclude extending the spanned
nodes set if memory gets ballooned in that's not associated with
any node the domain has "known" of so far.

> +Regarding the stealing a page from the OS part, it is enough to use the Linux
> +function alloc_page_node(), in place of alloc\_page().

Such a statement seems to confirm that you're thinking Linux-centric
instead of defining a generic model.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

