
[Xen-devel] Re: Xen balloon driver discuss



No, you're confusing two things.  pod_entries is the number of entries
in the p2m table that have neither been populated with memory, nor been
reclaimed by the balloon driver.

Are you sure the balloon driver is actually working?

Chu: Yes, the PoD "cache" is the memory pool which is used to populate
PoD entries.  "Cache" is a bad name, I should have called it "pool" to
begin with.

 -George

On 29/11/10 06:34, xiaoyun.maoxy wrote:
> Hi George:
> 
> I read
> http://lists.xensource.com/archives/html/xen-devel/2010-07/msg01404.html
> more carefully, and captured my printout from the first call of
> p2m_pod_demand_populate(), which is:
> 
> houyi-chunk2.dev.sd.aliyun.com login: blktap_sysfs_create: adding 
> attributes for dev ffff880122466400
> 
> (XEN) p2m_pod_demand_populate: =========pulate-on-demand memory! 
> tot_pages 132088 pod_entries 523776
> 
> And memory/target under /local/domain/1/ is 524288.
> 
> So 523776 is less than 524288, I think the problem is similar, right?
> 
> But the question is why the patch doesn’t work for me.
> 
> Many thanks.
> 
> *From:* tinnycloud [mailto:tinnycloud@xxxxxxxxxxx]
> *Date:* November 29, 2010 12:21
> *To:* 'Dan Magenheimer'; 'xen devel'
> *Cc:* 'george.dunlap@xxxxxxxxxxxxx'
> *Subject:* re: Xen balloon driver discuss
> 
> Hi Dan:
> 
> You are right, the HVM guest kernel is kernel-2.6.18-164.el5.src.rpm, coming 
> from ftp://ftp.redhat.com/pub/redhat/linux/enterprise/5Server/en/os/SRPMS/
> 
> Currently the balloon driver is compiled from this kernel. (So I am 
> afraid the driver may be out of date, and I plan to take a newer balloon.c 
> from xenlinux and build it against this kernel to get a new xen-balloon.ko.)
> 
> My Xen is 4.0.0; as mentioned, the dom0 kernel is pvops 2.6.31.
> 
> Actually, I have two problems: the first is the PoD “populate-on-demand 
> memory” issue, and the second is a Xen panic (I will run more tests and 
> report on it in another reply).
> 
> I googled around and applied the patch from 
> http://lists.xensource.com/archives/html/xen-devel/2010-07/msg01404.html, 
> but it doesn’t work for me.
> 
> ------------------------------------------------- Domain Crash Case ---------------------------------------------
> 
> The issue is easy to reproduce. I started one HVM guest with the command line:
> 
> xm cr hvm.linux.balloon maxmem=2048 memory=512
> 
> The guest works well at first, but crashes as soon as I log into it 
> through VNC.
> 
> The serial output is:
> 
> blktap_sysfs_create: adding attributes for dev ffff8801224df000
> 
> (XEN) p2m_pod_demand_populate: Out of populate-on-demand memory! 
> tot_pages 132088 pod_entries 9489
> 
> (XEN) domain_crash called from p2m.c:1127
> 
> (XEN) Domain 4 reported crashed by domain 0 on cpu#0:
> 
> (XEN) printk: 31 messages suppressed.
> 
> (XEN) grant_table.c:555:d0 Iomem mapping not permitted ffffffffffffffff 
> (domain 4)
> 
> blktap_sysfs_destroy
> 
> blktap_sysfs_create: adding attributes for dev ffff88012259ca00
> 
> ------------------------------------------------- Xen Crash Case ---------------------------------------------
> 
> In addition, if I start the guest like
> 
> xm cr hvm.linux.balloon maxmem=2048 memory=400
> 
> then Xen itself crashes; the serial output is:
> 
> blktap_sysfs_destroy
> 
> blktap_sysfs_create: adding attributes for dev ffff8801224df000
> 
> (XEN) p2m_pod_demand_populate: Out of populate-on-demand memory! 
> tot_pages 132088 pod_entries 9489
> 
> (XEN) domain_crash called from p2m.c:1127
> 
> (XEN) Domain 4 reported crashed by domain 0 on cpu#0:
> 
> (XEN) printk: 31 messages suppressed.
> 
> (XEN) grant_table.c:555:d0 Iomem mapping not permitted ffffffffffffffff 
> (domain 4)
> 
> blktap_sysfs_destroy
> 
> blktap_sysfs_create: adding attributes for dev ffff88012259ca00
> 
> blktap_sysfs_destroy
> 
> blktap_sysfs_create: adding attributes for dev ffff88012259c600
> 
> (XEN) Error: p2m lock held by p2m_change_type
> 
> (XEN) Xen BUG at p2m-ept.c:38
> 
> (XEN) ----[ Xen-4.0.0 x86_64 debug=n Not tainted ]----
> 
> (XEN) CPU: 6
> 
> (XEN) RIP: e008:[<ffff82c4801df2aa>] ept_pod_check_and_populate+0x13a/0x150
> 
> (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor
> 
> (XEN) rax: 0000000000000000 rbx: ffff83063fdc0000 rcx: 0000000000000092
> 
> (XEN) rdx: 000000000000000a rsi: 000000000000000a rdi: ffff82c48021e844
> 
> (XEN) rbp: ffff83023fefff28 rsp: ffff83023feffc18 r8: 0000000000000001
> 
> (XEN) r9: 0000000000000001 r10: 0000000000000000 r11: ffff82c4801318d0
> 
> (XEN) r12: ffff8302f5914ef8 r13: 0000000000000001 r14: 0000000000000000
> 
> (XEN) r15: 0000000000003bdf cr0: 0000000080050033 cr4: 00000000000026f0
> 
> (XEN) cr3: 000000063fc2e000 cr2: 00002ba99c046000
> 
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> 
> (XEN) Xen stack trace from rsp=ffff83023feffc18:
> 
> (XEN) 0000000000000002 0000000000000000 0000000000000000 ffff83063fdc0000
> 
> (XEN) ffff8302f5914ef8 0000000000000001 ffff83023feffc70 ffff82c4801df46e
> 
> (XEN) 0000000000000000 ffff83023feffcc4 0000000000003bdf 00000000000001df
> 
> (XEN) ffff8302f5914000 ffff83063fdc0000 ffff83023fefff28 0000000000003bdf
> 
> (XEN) 0000000000000002 0000000000000001 0000000000000030 ffff82c4801bafe4
> 
> (XEN) ffff8302f89dc000 000000043fefff28 ffff83023fefff28 0000000000003bdf
> 
> (XEN) 00000000002f9223 0000000000000030 ffff83023fefff28 ffff82c48019bab1
> 
> (XEN) 0000000000000000 00000001bdc62000 0000000000000000 0000000000000182
> 
> (XEN) ffff8300bdc62000 ffff82c4801b3824 ffff83063fdc0348 07008300bdc62000
> 
> (XEN) ffff83023fe808d0 0000000000000040 000000063fc3601e 0000000000000000
> 
> (XEN) ffff83023fefff28 ffff82c480167d17 ffff82c4802509c0 0000000000000000
> 
> (XEN) 0000000003bdf000 000000000001c000 ffff83023feffdc8 0000000000000080
> 
> (XEN) ffff82c480250dd0 0000000000003bdf 00ff82c480250080 ffff82c480250dc0
> 
> (XEN) ffff82c480250080 ffff82c480250dc0 0000000000004040 0000000000000000
> 
> (XEN) 0000000000004040 0000000000000040 ffff82c4801447da 0000000000000080
> 
> (XEN) ffff83023fefff28 0000000000000092 ffff82c4801a7f6c 00000000000000fc
> 
> (XEN) 0000000000000092 0000000000000006 ffff8300bdc63760 0000000000000006
> 
> (XEN) ffff82c48025c100 ffff82c480250100 ffff82c480250100 0000000000000292
> 
> (XEN) ffff8300bdc637f0 00000249b30f6a00 0000000000000292 ffff82c4801a9383
> 
> (XEN) 00000000000000ef ffff8300bdc62000 ffff8300bdc62000 ffff8300bdc637e8
> 
> (XEN) Xen call trace:
> 
> (XEN) [<ffff82c4801df2aa>] ept_pod_check_and_populate+0x13a/0x150
> 
> (XEN) [<ffff82c4801df46e>] ept_get_entry+0x1ae/0x1c0
> 
> (XEN) [<ffff82c4801bafe4>] p2m_change_type+0x144/0x1b0
> 
> (XEN) [<ffff82c48019bab1>] hvm_hap_nested_page_fault+0x121/0x190
> 
> (XEN) [<ffff82c4801b3824>] vmx_vmexit_handler+0x304/0x1a90
> 
> (XEN) [<ffff82c480167d17>] __smp_call_function_interrupt+0x57/0x90
> 
> (XEN) [<ffff82c4801447da>] __find_next_bit+0x6a/0x70
> 
> (XEN) [<ffff82c4801a7f6c>] vpic_get_highest_priority_irq+0x2c/0xa0
> 
> (XEN) [<ffff82c4801a9383>] pt_update_irq+0x33/0x1e0
> 
> (XEN) [<ffff82c4801a6042>] vlapic_has_pending_irq+0x42/0x70
> 
> (XEN) [<ffff82c4801a0c88>] hvm_vcpu_has_pending_irq+0x88/0xa0
> 
> (XEN) [<ffff82c4801b263b>] vmx_vmenter_helper+0x5b/0x150
> 
> (XEN) [<ffff82c4801ada63>] vmx_asm_do_vmentry+0x0/0xdd
> 
> (XEN)
> 
> (XEN)
> 
> (XEN) ****************************************
> 
> (XEN) Panic on CPU 6:
> 
> (XEN) Xen BUG at p2m-ept.c:38
> 
> (XEN) ****************************************
> 
> (XEN)
> 
> (XEN) Manual reset required ('noreboot' specified)
> 
> --------------------------------------- Working Configuration --------------------------------------------------
> 
> And if I start the guest like
> 
> xm cr hvm.linux.balloon maxmem=1024 memory=512
> 
> the guest can be logged into successfully through VNC.
> 
> Any idea what is happening?
> 
> PoD is new to me; I will try to learn more. Thanks.
> 
> *From:* Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
> *Date:* 2010.11.28 10:36
> *To:* tinnycloud; xen devel
> *Cc:* george.dunlap@xxxxxxxxxxxxx
> *Subject:* RE: Xen balloon driver discuss
> 
> Am I understanding correctly that you are running each linux-2.6.18 as 
> HVM (not PV)? I didn’t think that the linux-2.6.18 balloon driver worked 
> at all in an HVM guest.
> 
> You also didn’t say what version of Xen you are using. If you are 
> running xen-unstable, you should also provide the changeset number.
> 
> In any case, any load of HVM guests should never crash Xen itself, but 
> if you are running HVM guests, I probably can’t help much as I almost 
> never run HVM guests.
> 
> *From:* cloudroot [mailto:cloudroot@xxxxxxxx]
> *Sent:* Friday, November 26, 2010 11:55 PM
> *To:* tinnycloud; Dan Magenheimer; xen devel
> *Cc:* george.dunlap@xxxxxxxxxxxxx
> *Subject:* re: Xen balloon driver discuss
> 
> Hi Dan:
> 
> I have set up a benchmark to test the balloon driver, but unfortunately 
> Xen crashed with a memory panic.
> 
> Before I attach the detailed output from the serial port (which will take 
> time on the next run), I am afraid I might have missed something in the 
> test environment.
> 
> My dom0 kernel is 2.6.31, pvops.
> 
> Currently there is no drivers/xen/balloon.c in this kernel source tree, 
> so I built xen-balloon.ko and xen-platform-pci.ko from 
> linux-2.6.18.x86_64 and installed them in the domU, which is Red Hat 5.4.
> 
> What I did is put a C program in each domU (24 HVM guests in total); the 
> program allocates memory and fills it with random strings repeatedly.
> 
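For reference, a minimal sketch of such a stress program (just a
malloc/fill loop; the chunk size and sleep interval are made-up values,
not necessarily the ones used in the test):

  /* Repeatedly allocate a chunk and fill it with pseudo-random bytes so
   * that every page is really touched and backed by memory. */
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  #define CHUNK_BYTES (64UL * 1024 * 1024)  /* 64 MB per round (arbitrary) */

  int main(void)
  {
      for (;;) {
          char *buf = malloc(CHUNK_BYTES);
          if (!buf) {               /* allocation failed: back off */
              sleep(5);
              continue;
          }
          for (size_t i = 0; i < CHUNK_BYTES; i++)
              buf[i] = 'A' + (rand() % 26);
          sleep(1);
          free(buf);
      }
      return 0;
  }
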
> And in dom0, a Python monitor collects the meminfo from xenstore and 
> calculates the balloon target from Committed_AS.
> 
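A rough sketch of what that dom0 side can look like, assuming the guest
has written its /proc/meminfo text under memory/meminfo as described in
this thread (the domain id, the 64 MB reserve margin and the one-shot
structure are illustrative assumptions, not the actual monitor):

  /* Read the guest's meminfo from xenstore, extract Committed_AS, add a
   * safety margin and write the result back as the balloon target (the
   * balloon driver watches memory/target, in KiB).
   * Build roughly with: gcc monitor.c -lxenstore */
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <xs.h>

  #define RESERVE_KB (64 * 1024)      /* headroom to add (assumption) */

  int main(void)
  {
      const int domid = 1;            /* illustrative domain id */
      struct xs_handle *xsh = xs_daemon_open();
      char path[64], target[32], *meminfo, *p;
      unsigned long committed_kb = 0;
      unsigned int len;

      if (!xsh)
          return 1;

      snprintf(path, sizeof(path), "/local/domain/%d/memory/meminfo", domid);
      meminfo = xs_read(xsh, XBT_NULL, path, &len);
      if (meminfo) {
          /* /proc/meminfo contains a line like "Committed_AS:  123456 kB" */
          p = strstr(meminfo, "Committed_AS:");
          if (p)
              sscanf(p, "Committed_AS: %lu", &committed_kb);
          free(meminfo);
      }

      if (committed_kb) {
          snprintf(target, sizeof(target), "%lu", committed_kb + RESERVE_KB);
          snprintf(path, sizeof(path), "/local/domain/%d/memory/target", domid);
          xs_write(xsh, XBT_NULL, path, target, strlen(target));
      }

      xs_daemon_close(xsh);
      return 0;
  }

A real monitor would of course loop over all domains once a second and
apply whatever policy it likes before writing memory/target.
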
> The panic happens when the program is running in just one domU.
> 
> I am writing to ask whether my balloon driver is out of date, and where 
> I can get the latest source code.
> 
> I’ve googled a lot, but I am still confused about the various source trees.
> 
> Many thanks.
> 
> *From:* tinnycloud [mailto:tinnycloud@xxxxxxxxxxx]
> *Date:* 2010.11.23 22:58
> *To:* 'Dan Magenheimer'; 'xen devel'
> *Cc:* 'george.dunlap@xxxxxxxxxxxxx'
> *Subject:* re: Xen balloon driver discuss
> 
> Hi Dan:
> 
> Thanks for your presentation summarizing memory overcommit; it was really 
> vivid and a great help.
> 
> Well, I guess these days the strategy in my mind falls into solution 
> Set C in the PDF.
> 
> The tmem solution you worked out for memory overcommit is both 
> efficient and effective.
> 
> I guess I will give it a try on a Linux guest.
> 
> The real situation I have is that most of the running VMs on the host are 
> Windows, so I had to come up with those policies to balance the memory.
> 
> Policies are all workload dependent, but the good news is that the host 
> workload is configurable and not very heavy.
> 
> So I will try to figure out a favorable policy; the policies referred to 
> in the PDF are a good start for me.
> 
> Today, instead of trying to export “/proc/meminfo” through shared pages, 
> I hacked the balloon driver to add another workqueue that periodically 
> writes meminfo into xenstore through xenbus, which solved the problem of 
> xenstore's high CPU utilization.
> 
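For what it's worth, a heavily hedged sketch of that in-guest idea,
written against the 2.6.18-era xenlinux APIs (3-argument DECLARE_WORK,
xenbus_printf); the xenstore key names, the values reported and the
one-second interval are all assumptions, not the actual hack:

  /* Guest-side module: report memory figures to xenstore from a delayed
   * work item instead of a userspace daemon. */
  #include <linux/module.h>
  #include <linux/kernel.h>
  #include <linux/mm.h>
  #include <linux/workqueue.h>
  #include <xen/xenbus.h>

  static void meminfo_report(void *unused);
  static DECLARE_WORK(meminfo_work, meminfo_report, NULL);

  static void meminfo_report(void *unused)
  {
      struct sysinfo si;

      si_meminfo(&si);
      /* Relative paths resolve under /local/domain/<domid>/ for the guest;
       * these two keys are made up for the sketch. */
      xenbus_printf(XBT_NIL, "memory", "guest-totalram", "%lu", si.totalram);
      xenbus_printf(XBT_NIL, "memory", "guest-freeram", "%lu", si.freeram);

      /* Re-arm: report roughly once per second. */
      schedule_delayed_work(&meminfo_work, HZ);
  }

  static int __init meminfo_init(void)
  {
      schedule_delayed_work(&meminfo_work, HZ);
      return 0;
  }

  static void __exit meminfo_exit(void)
  {
      cancel_delayed_work(&meminfo_work);
      flush_scheduled_work();
  }

  module_init(meminfo_init);
  module_exit(meminfo_exit);
  MODULE_LICENSE("GPL");
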
> Later I will try to find out more about how Citrix does this.
> 
> Thanks for your help. Or do you have any better ideas for Windows guests?
> 
> *From:* Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
> *Date:* 2010.11.23 1:47
> *To:* MaoXiaoyun; xen devel
> *Cc:* george.dunlap@xxxxxxxxxxxxx
> *Subject:* RE: Xen balloon driver discuss
> 
> Xenstore IS slow and you could improve xenballoond performance by only 
> sending the single CommittedAS value from xenballoond in domU to dom0 
> instead of all of /proc/meminfo. But you are making an assumption that 
> getting memory utilization information from domU to dom0 FASTER (e.g. 
> with a shared page) will provide better ballooning results. I have not 
> found this to be the case, which is what led to my investigation into 
> self-ballooning, which led to Transcendent Memory. See the 2010 Xen 
> Summit for more information.
> 
> In your last paragraph below (“Regarding the balloon strategy”), the 
> problem is that it is not easy to define “enough memory” and “shortage of 
> memory” within any guest, and almost impossible to define them and 
> effectively load balance across many guests. See my Linux Plumber’s 
> Conference presentation (with complete speaker notes) here:
> 
> http://oss.oracle.com/projects/tmem/dist/documentation/presentations/MemMgmtVirtEnv-LPC2010-Final.pdf
> 
> http://oss.oracle.com/projects/tmem/dist/documentation/presentations/MemMgmtVirtEnv-LPC2010-SpkNotes.pdf
> 
> *From:* MaoXiaoyun [mailto:tinnycloud@xxxxxxxxxxx]
> *Sent:* Sunday, November 21, 2010 9:33 PM
> *To:* xen devel
> *Cc:* Dan Magenheimer; george.dunlap@xxxxxxxxxxxxx
> *Subject:* RE: Xen balloon driver discuss
> 
> 
> Currently /proc/meminfo is sent to domain 0 via xenstore, which in my 
> opinion is slow.
> What I want to do is have a shared page between domU and dom0: domU 
> periodically updates the meminfo in the page, while on the other side 
> dom0 retrieves the updated data to calculate the target, which the guest 
> then uses for ballooning.
> 
> The problem I have is that I currently don't know how to implement a 
> shared page between dom0 and domU.
> Would it be like dom0 allocating an unbound event channel and waiting for 
> the guest to connect, then transferring data through the grant table?
> Or does someone have a more efficient way?
> Many thanks.
> 
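Not an authoritative answer, but the domU half of that usually looks
roughly like the sketch below (2.6.18-xen style calls; the header paths
differ between trees, the "memory/meminfo-gref" key is made up, and the
dom0 side would map the advertised grant reference, e.g. through gntdev,
with an event channel only needed if you want notifications):

  /* domU side: allocate a page, grant dom0 read-only access to it, and
   * advertise the grant reference through xenstore so dom0 can map it. */
  #include <linux/module.h>
  #include <linux/mm.h>
  #include <xen/xenbus.h>
  #include <xen/gnttab.h>        /* <xen/grant_table.h> on pvops trees */

  static unsigned long shared_page;
  static int gref = -1;

  static int __init shared_meminfo_init(void)
  {
      shared_page = get_zeroed_page(GFP_KERNEL);
      if (!shared_page)
          return -ENOMEM;

      /* Grant domain 0 read-only access to this frame. */
      gref = gnttab_grant_foreign_access(0, virt_to_mfn(shared_page), 1);
      if (gref < 0) {
          free_page(shared_page);
          return gref;
      }

      /* Tell dom0 where to find the page; the key name is made up. */
      xenbus_printf(XBT_NIL, "memory", "meminfo-gref", "%d", gref);

      /* The guest would now copy its meminfo into (void *)shared_page
       * periodically; dom0 maps the reference and simply reads it. */
      return 0;
  }

  static void __exit shared_meminfo_exit(void)
  {
      if (gref >= 0)
          gnttab_end_foreign_access(gref, 1, shared_page);
  }

  module_init(shared_meminfo_init);
  module_exit(shared_meminfo_exit);
  MODULE_LICENSE("GPL");
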
>>  From: tinnycloud@xxxxxxxxxxx
>>  To: xen-devel@xxxxxxxxxxxxxxxxxxx
>>  CC: dan.magenheimer@xxxxxxxxxx; George.Dunlap@xxxxxxxxxxxxx
>>  Subject: Xen balloon driver discuss
>>  Date: Sun, 21 Nov 2010 14:26:01 +0800
>>
>>  Hi:
>>  Greeting first.
>>
>>  I was trying to run about 24 HVMs (currently only Linux, later also
>>  Windows) on one physical server with 24GB memory and 16 CPUs.
>>  Each VM is configured with 2GB memory, and I reserved 8GB memory for
>>  dom0.
>>  For safety reasons, only domain U's memory is allowed to balloon.
>>
>>  Inside domain U, I used the xenballoond provided by xensource, which
>>  periodically writes /proc/meminfo into xenstore in dom0
>>  (/local/domain/<did>/memory/meminfo).
>>  And in domain 0, I wrote a python script to read the meminfo and, like
>>  the xen-provided strategy, use Committed_AS to calculate the domain U
>>  balloon target.
>>  The time interval is 1 second.
>>
>>  Inside each VM, I set up an Apache server for testing. Well, I have to
>>  say the result is not so good.
>>  It appears there is too much reading/writing of xenstore: when I put
>>  some stress (using ab) on the guest domains, the CPU usage of xenstored
>>  goes up to 100%. Thus the monitor running in dom0 also responds quite
>>  slowly.
>>  Also, in the ab test, Committed_AS grows very fast and reaches maxmem
>>  in a short time, but in fact the guest really needs only a small amount
>>  of memory, so I guess there is more that should be taken into
>>  consideration for ballooning.
>>
>>  For the xenstore issue, I first plan to write a C program inside domain
>>  U to replace xenballoond and see whether the situation improves. If
>>  not, how about setting up an event channel directly between domU and
>>  dom0; would that be faster?
>>
>>  Regarding the balloon strategy, I would do it like this: when there is
>>  enough memory, just fulfill the guest balloon request, and when memory
>>  is short, distribute memory evenly among the guests that request
>>  inflation (see the sketch after this quoted message).
>>
>>  Does anyone have a better suggestion? Thanks in advance.
>>
> 
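The even-split policy described in the quoted message could look roughly
like this (pure illustration: page accounting, per-guest minimums, dom0
reservations and so on are all ignored):

  /* If the host can satisfy every inflation request, grant each request
   * in full; otherwise split the free memory evenly across requesters. */
  #include <stdio.h>

  static unsigned long grant_for(unsigned long request,
                                 unsigned long host_free,
                                 unsigned long total_requested,
                                 unsigned int nr_requesters)
  {
      if (total_requested <= host_free)
          return request;                    /* enough memory: fulfill it */
      return host_free / nr_requesters;      /* shortage: even share */
  }

  int main(void)
  {
      /* Example: 3 guests each ask for 512 MB but only 900 MB is free. */
      unsigned long req[] = { 512, 512, 512 };   /* MB, made-up numbers */
      unsigned long host_free = 900, total = 512 * 3;
      for (int i = 0; i < 3; i++)
          printf("guest %d gets %lu MB\n", i,
                 grant_for(req[i], host_free, total, 3));
      return 0;
  }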


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

