
Re: [Xen-devel] questions about ballooning



On Sun, 2007-11-04 at 10:34 -0500, weiming wrote:
> Hi Daniel,
> 
> I very much appreciate your detailed explanation. You clarified some
> of my confusion. Before I posted my question here, I read the papers
> "Memory Resource Management in VMware ESX Server" and "Art of
> Virtualization", read the Xen manual, checked the Xen wiki, and
> searched the mailing list archive, but couldn't get a complete
> picture of ballooning.
> 
> 1) when a guest OS starts, how does it determine the amount of
> physical memory? i.e. which value determines the number of entries in
> mem_map? Is the value specified in the configuration file?

one page of initial domain memory is dedicated to a 'start_info'
structure. you may grep the xen/include sources for the definition.
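
roughly, the relevant part of that structure looks like the following
(abridged from memory, from xen/include/public/xen.h; field names and
ordering depend on the interface version, so treat this as a sketch
rather than the authoritative definition):

  struct start_info {
      char magic[32];             /* "xen-<version>-<platform>"             */
      unsigned long nr_pages;     /* total pages made available to domain   */
      unsigned long shared_info;  /* MACHINE address of shared_info struct  */
      uint32_t flags;             /* SIF_xxx flags                          */
      xen_pfn_t store_mfn;        /* machine page number of xenstore page   */
      uint32_t store_evtchn;      /* event channel for xenstore             */
      /* ... console fields ... */
      unsigned long pt_base;      /* VIRTUAL address of page directory      */
      unsigned long nr_pt_frames; /* number of bootstrap page table frames  */
      unsigned long mfn_list;     /* VIRTUAL address of page-frame list     */
      /* ... module and command line fields omitted ... */
  };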

then see e.g. linux-2.6-xen-sparse/arch/i386/kernel/setup-xen.c

as i believe you already understood, there are two distinct values to
keep apart here:
- 'nr_pages': the size of the physical address range which the domain
  can use. that's basically the maximum memory, different
  from what the domain actually gets.
- 'reservation': the amount in nr_pages actually filled with machine
  memory.

nr_pages is in start_info, as is the frame list corresponding to the
initial reservation set by the domain builder. the domain builder gets
nr_pages from the memory= field in the configuration file.
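
as a rough sketch of the plumbing (not the literal setup-xen.c code;
names differ between versions and architectures), the guest sizes its
pseudo-physical map from that value:

  /* sketch: xen_start_info is the mapped start_info page; the guest
   * derives its maximum pseudo-physical frame number from the value
   * the domain builder put there (per the above, the memory= field). */
  extern start_info_t *xen_start_info;

  static unsigned long guest_max_pfn(void)
  {
          return xen_start_info->nr_pages;
  }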

i'm not sure how this bootstraps with respect to the balloon, e.g.
whether the whole initial memory is allocated up front and only
returned to xen later on demand, or whether the initial reservation
starts below the maximum and is only grown by the balloon afterwards.
i believe the former is the case. maybe someone else can comment
(please).

> 2) what's the exact role that xm mem-max plays?  I can set it to be
> higher than the value in the configuration file.  I think it just
> sets the "new_target" for the balloon via xenbus or
> /proc/xen/balloon, right?

you can tell the domain its physical limit is 1G. that's e.g. what the
guest's storage allocator then uses to initialize itself. but you can
just as well go back afterwards and modify a configurable limit below
that hard limit. it's then up to the balloon to wring the memory out of
the guest system.

why higher values get accepted i cannot comment on. maybe they are
clipped without further comment?

see, the kernel can free a lot of memory even when it is 'in use', by
swapping it to disk. that's one of the basic ideas of having a balloon
driver: do not build your own page replacement, but put pressure on the
existing guest memory management to do it for you. that is what the
call to alloc_page() in the balloon driver is essentially doing. otoh,
memory in use by the kernel cannot be swapped. that's why the pages
grabbed by the balloon itself remain safe: that memory must stay
locked.
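
as a (heavily simplified, from-memory) sketch of that idea, one
inflation step looks roughly like this; the real sparse-tree driver
keeps the pages on a list, batches the hypercall and also updates the
phys-to-machine table, which i'm omitting here:

  /* sketch: let the guest's own memory management free a page for us
   * (possibly by swapping something out), then hand the underlying
   * machine frame back to xen. */
  struct page *page = alloc_page(GFP_HIGHUSER);
  if (page) {
          xen_pfn_t frame = pfn_to_mfn(page_to_pfn(page));
          struct xen_memory_reservation reservation = {
                  .nr_extents   = 1,
                  .extent_order = 0,
                  .domid        = DOMID_SELF,
          };
          set_xen_guest_handle(reservation.extent_start, &frame);
          /* the page stays allocated (hence unswappable) inside the
           * kernel, but its machine frame belongs to xen again. */
          HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);
  }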

but, again, i should admit that my own understanding gets a bit fuzzy
here, regarding which is which in the config file and within xm
parameters. you're right in that the communication is performed via
xenbus. i spent more time reading xen and kernel code than the
surrounding python source. maybe someone else can comment better or
(hopefully) correct me if i'm talking rubbish somewhere above. 

send me an update once you hit it. :)

> 3) Once some pages are "ballooned out", these pages will be utilized
> by other domains, so if we later try to restore the initial status,
> how does the VMM find available pages?

the memory gets balanced by asking other domains to decrease their
reservation. when you're a driver, you can wrench the domain down to
the bare kernel: the kernel gives you anything you ask for, physically.
or rather, anything up to the point where the linux oom killer kicks
in, a rather undesirable side effect, as a recent thread on this list
discussed.

> In increase_reservation(), 
> ...
> rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation)
> if (rc < nr_pages)
>  ...
> 
> In my understanding, the hypervisor *tries* to find some free pages
> to return to the OS.

yes, this can fail. there's no fundamental (i.e. consistency) problem
in failing. the kernel will find that all memory is in use, just as it
would on a native system if something grabbed all the memory. so
there's a balloon driver saying "admittedly, i'm presently sitting on
80% of all memory, but now it's mine. you modprobed me, so you trust
me; now go look somewhere else". a native driver would not even have
been asked.
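
for reference, the hypercall in question takes a reservation descriptor
naming the frames to repopulate; simplified from the sparse-tree
balloon driver (details from memory, so double-check against the
source):

  /* sketch: ask xen to back nr_pages pseudo-physical frames, listed in
   * frame_list, with machine memory again.  the return value is the
   * number of extents actually populated, which may be less than
   * requested if the xen heap is short on memory. */
  struct xen_memory_reservation reservation = {
          .nr_extents   = nr_pages,
          .extent_order = 0,
          .domid        = DOMID_SELF,
  };
  long rc;

  set_xen_guest_handle(reservation.extent_start, frame_list);
  rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
  if (rc < nr_pages) {
          /* partial success: only rc frames were granted; the driver
           * keeps the remainder on its ballooned list and retries
           * later. */
  }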

> 4) in balloon.c, there are some functions whose calling sites I can't
> find: dealloc_pte_fn, alloc_empty_pages_and_pagevec,

no idea.

>  balloon_update_driver_allowance,

this one i can explain. apart from the balloon, there is other memory
entering and leaving the domU: 'I/O memory', which is moved between
frontend and backend drivers to transfer data. for both receiving and
sending, the domU is required to take memory from its own reservation.
so it hands these pages over to the backend driver domain and gets them
back only once the backend is finished with the transfer (i.e.
mapping/unmapping, similar to ballooning). the balloon driver accounts
for this memory, so the frontends call this function to let it know.
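
i.e. a frontend brackets its I/O page donations roughly like this (a
sketch; if i remember correctly the prototype in the sparse tree is
balloon_update_driver_allowance(long delta), and i may have the sign
convention backwards):

  /* sketch: a frontend granting n of its own pages to the backend for
   * a transfer tells the balloon driver, so those frames are accounted
   * as driver pages rather than as ordinary guest memory. */
  balloon_update_driver_allowance(n);   /* pages leave our reservation */
  /* ... grant the pages, let the backend map/unmap them, do the I/O ... */
  balloon_update_driver_allowance(-n);  /* pages are back with us */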

>  etc. Are they called back by the hypervisor?

they are certainly not immediate callback functions. control transfers
into the guest, if they are initiated by the hypervisor, are always
done via event channels. the only other path would be an iret. that
means there is no direct 'call' between xen and guests. those symbols
you see must be in use somewhere.
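
to illustrate (a sketch; the exact prototype of the handler and of
bind_evtchn_to_irqhandler() varies between kernel versions, and
my_handler/my_dev are made-up names):

  /* sketch: the guest binds a handler to an event channel; when xen
   * raises the channel, the handler runs out of the guest's own event
   * upcall path, not as a direct call from the hypervisor. */
  int irq = bind_evtchn_to_irqhandler(evtchn, my_handler,
                                      0 /* irqflags */, "my-frontend",
                                      my_dev);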

regards,
daniel

-- 
Daniel Stodden
LRR     -      Lehrstuhl für Rechnertechnik und Rechnerorganisation
Institut für Informatik der TU München             D-85748 Garching
http://www.lrr.in.tum.de/~stodden         mailto:stodden@xxxxxxxxxx
PGP Fingerprint: F5A4 1575 4C56 E26A 0B33  3D80 457E 82AE B0D8 735B



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

