
Re: [Xen-devel] Kernel 3.11 / 3.12 OOM killer and Xen ballooning



Bob Liu wrote:
On 12/20/2013 03:08 AM, James Dingwall wrote:
Bob Liu wrote:
On 12/12/2013 12:30 AM, James Dingwall wrote:
Bob Liu wrote:
On 12/10/2013 11:27 PM, Konrad Rzeszutek Wilk wrote:
On Tue, Dec 10, 2013 at 02:52:40PM +0000, James Dingwall wrote:
Konrad Rzeszutek Wilk wrote:
On Mon, Dec 09, 2013 at 05:50:29PM +0000, James Dingwall wrote:
Hi,

Since 3.11 I have noticed that the OOM killer quite frequently
triggers in my Xen guest domains which use ballooning to
increase/decrease their memory allocation according to their
requirements.  One example domain I have has a maximum memory
setting of ~1.5Gb but it usually idles at ~300Mb, it is also
configured with 2Gb swap which is almost 100% free.

# free
             total       used       free     shared    buffers     cached
Mem:        272080     248108      23972          0       1448      63064
-/+ buffers/cache:     183596      88484
Swap:      2097148          8    2097140
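As an aside, the "-/+ buffers/cache" line in the free output above is pure arithmetic on the raw figures; a minimal sketch using the values from this mail (units are kB):

```python
# Raw figures from the `free` output above (kB).
used, buffers, cached, free_kb = 248108, 1448, 63064, 23972

# Buffers and page cache are reclaimable, so from an application's point
# of view they count as available rather than used.
used_minus_cache = used - buffers - cached
free_plus_cache = free_kb + buffers + cached

print(used_minus_cache)  # 183596, matching the "-/+ buffers/cache" used column
print(free_plus_cache)   # 88484, matching its free column
```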

There is plenty of available free memory in the hypervisor to
balloon to the maximum size:
# xl info | grep free_mem
free_memory            : 14923

An example trace (they are always the same) from the oom killer in
3.12 is added below.  So far I have not been able to reproduce this
at will so it is difficult to start bisecting it to see if a
particular change introduced this.  However it does seem that the
behaviour is wrong because a) ballooning could give the guest more
memory, b) there is lots of swap available which could be used as a
fallback.
Keep in mind that with tmem, swap is actually no longer swap. Heh, that
sounds odd - but basically pages that are destined for swap end up
going into the tmem code, which pipes them up to the hypervisor.

If other information could help or there are more tests that I could
run, then please let me know.
I presume you have enabled 'tmem' both in the hypervisor and in
the guest right?
Yes, domU and dom0 both have the tmem module loaded, and "tmem
tmem_dedup=on tmem_compress=on" is given on the Xen command line.
Excellent. The odd thing is that your swap is not used that much, but
it should be (as that is part of what the self-balloon is supposed to
do).

Bob, you had a patch for the logic of how self-balloon is supposed
to account for the slab - would this be relevant to this problem?

Perhaps - I have attached the patch.
James, could you please apply it and try your application again? You
will have to rebuild the guest kernel.
Oh, and also take a look at whether frontswap is in use; you can check
it by watching "cat /sys/kernel/debug/frontswap/*".
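A small helper for watching those counters could look like the sketch below; the debugfs path and the counter file names depend on the kernel build, so treat both as assumptions rather than a fixed interface:

```python
import os

def read_frontswap_stats(debugfs_dir="/sys/kernel/debug/frontswap"):
    """Read each single-value counter file under the frontswap debugfs
    directory and return them as a dict.

    The default path and the file names it contains are assumptions
    about the kernel's debugfs layout; adjust for your build.
    """
    stats = {}
    for name in sorted(os.listdir(debugfs_dir)):
        path = os.path.join(debugfs_dir, name)
        with open(path) as f:
            stats[name] = int(f.read().strip())
    return stats
```

Run it periodically (e.g. once per second) on the guest to see whether the counters are actually moving while the workload is under memory pressure.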
I have tested this patch with a workload where I have previously seen
failures and so far so good.  I'll try to keep a guest with it stressed
to see if I do get any problems.  I don't know if it is expected but I
By the way, besides the longer kswapd time, does this patch work well
during your stress testing?

Have you seen the OOM killer trigger quite frequently again (with
selfshrink=true)?

Thanks,
-Bob
It was looking good until today (selfshrink=true). The trace below is
from during a compile of subversion; it looks like the memory has
ballooned to almost the maximum permissible, but even under pressure
the swap disk has hardly come into use.

So without selfshrink, can the swap disk be used a lot?

If that's the case, I'm afraid the frontswap-selfshrink in
xen-selfballoon is doing something incorrect.

Could you please try this patch, which makes the frontswap-selfshrink
slower and adds a printk for debugging?
Please still keep selfshrink=true in your test; it can be with or
without my previous patch.
Thanks a lot!

The oom trace below was triggered during a compile of gcc. I have the full dmesg from boot, which shows all the printks; please let me know if you would like to see it.

James


[504372.929678] frontswap selfshrink 5424 pages
[504403.018185] frontswap selfshrink 5152 pages
[504433.124844] frontswap selfshrink 4894 pages
[504468.335358] frontswap selfshrink 12791 pages
[504498.536467] frontswap selfshrink 12152 pages
[504533.813484] frontswap selfshrink 19751 pages
[504589.067299] frontswap selfshrink 19043 pages
[504638.441894] cc1plus invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
[504638.441902] CPU: 1 PID: 21506 Comm: cc1plus Tainted: G W    3.12.5 #88
[504638.441905] ffff88001ca406f8 ffff880002c0fa58 ffffffff8148f200 ffff88001f90e8e8
[504638.441909] ffff88001ca401c0 ffff880002c0faf8 ffffffff8148ccf7 ffff880002c0faa8
[504638.441912] ffffffff810f8d97 ffff880002c0fa88 ffffffff81006dc8 ffff880002c0fa98
[504638.441917] Call Trace:
[504638.441928]  [<ffffffff8148f200>] dump_stack+0x46/0x58
[504638.441932]  [<ffffffff8148ccf7>] dump_header.isra.9+0x6d/0x1cc
[504638.441938]  [<ffffffff810f8d97>] ? super_cache_count+0xa8/0xb8
[504638.441943]  [<ffffffff81006dc8>] ? xen_clocksource_read+0x20/0x22
[504638.441946]  [<ffffffff81006ea9>] ? xen_clocksource_get_cycles+0x9/0xb
[504638.441951]  [<ffffffff81494abe>] ? _raw_spin_unlock_irqrestore+0x47/0x62
[504638.441957]  [<ffffffff81296b27>] ? ___ratelimit+0xcb/0xe8
[504638.441962]  [<ffffffff810b2bbf>] oom_kill_process+0x70/0x2fd
[504638.441966]  [<ffffffff810bca0e>] ? zone_reclaimable+0x11/0x1e
[504638.441970]  [<ffffffff81048779>] ? has_ns_capability_noaudit+0x12/0x19
[504638.441973]  [<ffffffff81048792>] ? has_capability_noaudit+0x12/0x14
[504638.441976]  [<ffffffff810b32de>] out_of_memory+0x31b/0x34e
[504638.441981]  [<ffffffff810b7438>] __alloc_pages_nodemask+0x65b/0x792
[504638.441985]  [<ffffffff810e3da3>] alloc_pages_vma+0xd0/0x10c
[504638.441988]  [<ffffffff81003f69>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[504638.441993]  [<ffffffff810cf7cd>] handle_mm_fault+0x6d4/0xd54
[504638.441996]  [<ffffffff81006dc8>] ? xen_clocksource_read+0x20/0x22
[504638.441999]  [<ffffffff810115d2>] ? sched_clock+0x9/0xd
[504638.442005]  [<ffffffff8106772f>] ? sched_clock_local+0x12/0x75
[504638.442008]  [<ffffffff8106823b>] ? arch_vtime_task_switch+0x81/0x86
[504638.442013]  [<ffffffff81037f40>] __do_page_fault+0x3d8/0x437
[504638.442016]  [<ffffffff81062f1e>] ? finish_task_switch+0xe8/0x144
[504638.442018]  [<ffffffff810115d2>] ? sched_clock+0x9/0xd
[504638.442021]  [<ffffffff8106772f>] ? sched_clock_local+0x12/0x75
[504638.442025]  [<ffffffff810a45cc>] ? __acct_update_integrals+0xb4/0xbf
[504638.442028]  [<ffffffff810a493f>] ? acct_account_cputime+0x17/0x19
[504638.442030]  [<ffffffff81067c28>] ? account_user_time+0x67/0x92
[504638.442033]  [<ffffffff8106811b>] ? vtime_account_user+0x4d/0x52
[504638.442036]  [<ffffffff81037fd8>] do_page_fault+0x1a/0x5a
[504638.442041]  [<ffffffff810a065f>] ? rcu_user_enter+0xe/0x10
[504638.442044]  [<ffffffff81495158>] page_fault+0x28/0x30
[504638.442046] Mem-Info:
[504638.442048] Node 0 DMA per-cpu:
[504638.442050] CPU    0: hi:    0, btch:   1 usd:   0
[504638.442052] CPU    1: hi:    0, btch:   1 usd:   0
[504638.442053] Node 0 DMA32 per-cpu:
[504638.442055] CPU    0: hi:  186, btch:  31 usd:  26
[504638.442057] CPU    1: hi:  186, btch:  31 usd:  72
[504638.442058] Node 0 Normal per-cpu:
[504638.442060] CPU    0: hi:    0, btch:   1 usd:   0
[504638.442061] CPU    1: hi:    0, btch:   1 usd:   0
[504638.442067] active_anon:103684 inactive_anon:103733 isolated_anon:0
 active_file:10897 inactive_file:15059 isolated_file:0
 unevictable:0 dirty:1 writeback:0 unstable:0
 free:1164 slab_reclaimable:2356 slab_unreclaimable:3421
 mapped:4413 shmem:200 pagetables:2699 bounce:0
 free_cma:0 totalram:249264 balloontarget:315406
[504638.442069] Node 0 DMA free:1964kB min:88kB low:108kB high:132kB active_anon:4664kB inactive_anon:4736kB active_file:628kB inactive_file:1420kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15996kB managed:15084kB mlocked:0kB dirty:0kB writeback:0kB mapped:228kB shmem:0kB slab_reclaimable:184kB slab_unreclaimable:324kB kernel_stack:48kB pagetables:256kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:21824 all_unreclaimable? yes
[504638.442078] lowmem_reserve[]: 0 469 469 469
[504638.442081] Node 0 DMA32 free:2692kB min:2728kB low:3408kB high:4092kB active_anon:175172kB inactive_anon:175184kB active_file:21244kB inactive_file:35340kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:507904kB managed:458288kB mlocked:0kB dirty:0kB writeback:0kB mapped:7764kB shmem:676kB slab_reclaimable:6628kB slab_unreclaimable:11496kB kernel_stack:1720kB pagetables:8444kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:613279 all_unreclaimable? yes
[504638.442088] lowmem_reserve[]: 0 0 0 0
[504638.442091] Node 0 Normal free:0kB min:0kB low:0kB high:0kB active_anon:234900kB inactive_anon:235012kB active_file:21716kB inactive_file:23476kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:524288kB managed:523684kB mlocked:0kB dirty:4kB writeback:0kB mapped:9660kB shmem:124kB slab_reclaimable:2612kB slab_unreclaimable:1864kB kernel_stack:136kB pagetables:2096kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:773613 all_unreclaimable? yes
[504638.442098] lowmem_reserve[]: 0 0 0 0
[504638.442101] Node 0 DMA: 1*4kB (R) 3*8kB (R) 1*16kB (R) 0*32kB 0*64kB 1*128kB (R) 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB = 1964kB
[504638.442114] Node 0 DMA32: 673*4kB (UE) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2692kB
[504638.442123] Node 0 Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
[504638.442131] 22294 total pagecache pages
[504638.442133] 11197 pages in swap cache
[504638.442135] Swap cache stats: add 3449125, delete 3437928, find 590699/956067
[504638.442136] Free swap  = 1868108kB
[504638.442137] Total swap = 2097148kB
[504638.452335] 262143 pages RAM
[504638.452336] 6697 pages reserved
[504638.452337] 558286 pages shared
[504638.452338] 239987 pages non-shared
[504638.452340] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
<snip process list>
[504638.452596] Out of memory: Kill process 21506 (cc1plus) score 123 or sacrifice child
[504638.452598] Killed process 21506 (cc1plus) total-vm:543168kB, anon-rss:350300kB, file-rss:9520kB
[504659.367289] frontswap selfshrink 18428 pages
[504689.415694] frontswap selfshrink 479 pages
[504719.462401] frontswap selfshrink 456 pages
[504749.506876] frontswap selfshrink 434 pages
[504779.558204] frontswap selfshrink 406 pages
[504809.604425] frontswap selfshrink 386 pages
[504839.654849] frontswap selfshrink 367 pages
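As a side note, the "frontswap selfshrink N pages" printks above are easy to tally from a dmesg capture; a minimal sketch (the helper name is hypothetical, and a 4 kB page size is assumed):

```python
import re

SELFSHRINK_RE = re.compile(r"frontswap selfshrink (\d+) pages")

def total_selfshrunk_kb(dmesg_lines, page_kb=4):
    """Sum the pages reported by the selfshrink debug printk and
    convert to kB, assuming 4 kB pages."""
    pages = sum(int(m.group(1))
                for line in dmesg_lines
                for m in SELFSHRINK_RE.finditer(line))
    return pages * page_kb

# Two sample lines copied from the log above.
log = [
    "[504372.929678] frontswap selfshrink 5424 pages",
    "[504403.018185] frontswap selfshrink 5152 pages",
]
print(total_selfshrunk_kb(log))  # 42304
```

Comparing that total against the swap-cache and free-swap figures in the OOM report gives a rough idea of how aggressively selfshrink is pulling pages back out of the hypervisor.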



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

