Kernel memory fragmentation triggering OOM
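Last edited by linuxfellow on 2015-03-17 09:21
The system is so fragmented that the OOM killer fires. When it fired there was still about 17 MB of free memory, but only in small blocks (order 0 and order 1 according to the buddy list further down); every larger block had been used up.
That points to external fragmentation in the kernel. The options I can think of are: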
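1: CONFIG_COMPACTION=y (enable memory compaction)
2: Reduce the granularity at which applications request memory
In uclibc, DEFAULT_TRIM_THRESHOLD defaults to 256k. That may be too large; how about changing it to 64k?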
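For item 2, one thing that might avoid patching the library: glibc documents mallopt(M_TRIM_THRESHOLD, ...) for changing the trim threshold at run time. Below is a minimal sketch under that assumption. Whether the uClibc build used here offers the same call depends on which malloc implementation it was configured with, which this thread does not say. The helper name set_trim_threshold is made up for illustration.

```c
#include <malloc.h>   /* mallopt(), M_TRIM_THRESHOLD (glibc) */

/*
 * Hypothetical helper: lower the malloc trim threshold at run time
 * instead of changing DEFAULT_TRIM_THRESHOLD in the library source.
 * Returns 1 on success and 0 on failure, mirroring mallopt().
 * Assumption: the C library in use implements M_TRIM_THRESHOLD.
 */
int set_trim_threshold(int bytes)
{
    return mallopt(M_TRIM_THRESHOLD, bytes);
}
```

Calling set_trim_threshold(64 * 1024) early in main() would try the 64k value proposed above without rebuilding the library. It only changes when free heap memory is handed back to the kernel; whether that reduces fragmentation on this system would still have to be measured.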
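The following are ways to guard against OOM in general:
3: Set DEF_PRIORITY to a smaller value so that each scan pass covers more pages. Would that hurt system performance much?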
/*
 * The "priority" of VM scanning is how much of the queues we will scan in one
 * go. A value of 12 for DEF_PRIORITY implies that we will scan 1/4096th of the
 * queues ("queue_length >> 12") during an aging round.
 */
#define DEF_PRIORITY 12
4: Modify the zone_watermark_ok algorithm to widen the gap between watermark_low and watermark_min, so that kswapd starts earlier and has enough time to reclaim pages
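To put numbers on item 3, here is a small sketch of the "queue_length >> priority" rule from the quoted comment. The function names are made up for illustration, and the real reclaim code does more than this shift, so treat it as arithmetic only.

```c
#include <stdio.h>

/*
 * Pages scanned from one queue in a single pass at the given priority,
 * following the "queue_length >> priority" rule in the quoted comment.
 */
unsigned long scan_target(unsigned long queue_length, int priority)
{
    return queue_length >> priority;
}

/* Print the per-pass scan target for priorities 12 down to `lowest`. */
void print_scan_targets(unsigned long queue_length, int lowest)
{
    for (int prio = 12; prio >= lowest; prio--)
        printf("priority %2d: scan %lu of %lu pages\n",
               prio, scan_target(queue_length, prio), queue_length);
}
```

With the active_anon:14135 figure from the Mem-info dump in this thread, priority 12 gives 3 pages per pass and priority 10 gives 13. The kernel also lowers the priority on later passes when reclaim is not making progress, so DEF_PRIORITY mainly sets how gently scanning starts.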
From the uclibc/glibc side, are there any other good approaches?
Yes, I've had some time lately to look into OOM, so I'm posting this to share.
Enabling CONFIG_COMPACTION should already be enough.
Reply to 3# gaojl0728
I've added that option and am running a 24-hour test; I'll check the results tomorrow.
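While a test like this runs, compaction can also be requested by hand: kernels built with compaction support document /proc/sys/vm/compact_memory, and writing 1 to it compacts all zones. A minimal sketch follows. The helper name is made up, and the path is passed in as a parameter only so the function can be pointed at a scratch file.

```c
#include <stdio.h>

/* Sysctl file documented for kernels built with compaction support. */
#define COMPACT_MEMORY_PATH "/proc/sys/vm/compact_memory"

/*
 * Hypothetical helper: write "1" to `path` to request compaction.
 * Returns 0 on success, -1 if the file could not be opened or written.
 */
int request_compaction(const char *path)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = (fputs("1\n", f) >= 0);
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

request_compaction(COMPACT_MEMORY_PATH) needs root, and it fails with -1 on a kernel that lacks the option, which doubles as a quick check that the new config actually took effect.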
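I'd also like to try item 2, but I couldn't find a matching uclibc configuration option; it means editing malloc.h, so I've set it aside for now.
Last edited by linuxfellow on 2015-03-20 09:54
A textbook case of memory fragmentation: the large blocks are all used up and only small ones remain: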
Normal: 4229*4kB 34*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 17188kB
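To check a line like this without doing the sums by hand, here is a small parser sketch. The struct and function names are made up for illustration; it only assumes the "count*sizekB" layout shown in the line above.

```c
#include <stdio.h>
#include <string.h>

#define MAX_ORDER_SLOTS 11   /* the line above has 11 columns (orders 0..10) */

struct buddy_summary {
    unsigned long total_kb;  /* sum of count * block size over all columns */
    int highest_order;       /* largest order with a free block, -1 if none */
};

/*
 * Parse the "count*sizekB" pairs of one free-list line.
 * Returns the number of pairs parsed and fills in *out.
 */
int summarize_buddy_line(const char *line, struct buddy_summary *out)
{
    const char *p = strchr(line, ':');
    int order = 0;

    out->total_kb = 0;
    out->highest_order = -1;
    p = p ? p + 1 : line;            /* skip the zone name if present */

    while (order < MAX_ORDER_SLOTS) {
        unsigned long count, size_kb;
        int used = 0;

        /* %n records how many characters this pair consumed. */
        if (sscanf(p, " %lu*%lukB%n", &count, &size_kb, &used) != 2 || used == 0)
            break;                   /* reached "= total" or malformed input */
        out->total_kb += count * size_kb;
        if (count > 0)
            out->highest_order = order;
        p += used;
        order++;
    }
    return order;
}
```

For the line above it parses 11 columns, totals 17188 kB (4229*4 + 34*8), and reports order 1 as the highest order that still has a free block.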
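Also, even launching a very small shell script needs at least 4 contiguous pages (an order-2 allocation, requested by pgd_alloc during execve according to the backtrace):
app_abc.sh: page allocation failure: order:2, mode:0xd0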
Out of memory (oom_kill_allocating_task): Kill process 3470 (app_abc.sh) score 0 or sacrifice child
Killed process 3470 (app_abc.sh) total-vm:1160kB, anon-rss:60kB, file-rss:96kB
app_abc.sh: page allocation failure: order:2, mode:0xd0
[<c001346c>] (unwind_backtrace+0x0/0xec) from [<c005da50>] (warn_alloc_failed+0xc8/0x11c)
[<c005da50>] (warn_alloc_failed+0xc8/0x11c) from [<c005f9f8>] (__alloc_pages_nodemask+0x108/0x618)
[<c005f9f8>] (__alloc_pages_nodemask+0x108/0x618) from [<c005ff18>] (__get_free_pages+0x10/0x4c)
[<c005ff18>] (__get_free_pages+0x10/0x4c) from [<c00158b8>] (pgd_alloc+0x14/0xe0)
[<c00158b8>] (pgd_alloc+0x14/0xe0) from [<c001e25c>] (mm_init.isra.56+0x94/0xd8)
[<c001e25c>] (mm_init.isra.56+0x94/0xd8) from [<c008f3a4>] (bprm_mm_init+0x10/0x188)
[<c008f3a4>] (bprm_mm_init+0x10/0x188) from [<c008f858>] (do_execve+0xf0/0x2a0)
[<c008f858>] (do_execve+0xf0/0x2a0) from [<c0011064>] (sys_execve+0x34/0x54)
[<c0011064>] (sys_execve+0x34/0x54) from [<c000de00>] (ret_fast_syscall+0x0/0x30)
Mem-info:
Normal per-cpu:
CPU 0: hi: 42, btch: 7 usd:10
active_anon:14135 inactive_anon:4244 isolated_anon:0
active_file:58 inactive_file:76 isolated_file:0
unevictable:4254 dirty:5 writeback:1 unstable:0
free:4325 slab_reclaimable:468 slab_unreclaimable:2239
mapped:571 shmem:5183 pagetables:302 bounce:0
Normal free:17188kB min:1440kB low:1800kB high:2160kB active_anon:56540kB inactive_anon:16976kB active_file:252kB inactive_file:524kB unevictable:17016kB isolated(anon):0kB isolated(file):68kB present:130048kB mlocked:0kB dirty:20kB writeback:4kB mapped:2356kB shmem:20732kB slab_reclaimable:1872kB slab_unreclaimable:8956kB kernel_stack:1768kB pagetables:1208kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:17 all_unreclaimable? no
lowmem_reserve[]: 0 0
Normal: 4229*4kB 34*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 17188kB
9637 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap= 0kB
Total swap = 0kB
32768 pages of RAM
4765 free pages
1553 reserved pages
2707 slab pages
5949 pages shared
0 pages swap cached
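Regarding item 4 (zone_watermark_ok): the sketch below is a simplified model of a per-order watermark check, written from memory of how kernels of this era discount low-order blocks, so treat the exact arithmetic as an assumption, not a copy of the kernel source. lowmem_reserve (0 in the dump) and any allocation-flag adjustments are left out, and the function name is made up.

```c
#include <stdbool.h>

/*
 * Simplified model of a per-order watermark check.
 * nr_free[o] is the number of free blocks of order o,
 * min_pages is the watermark in pages.
 */
bool watermark_ok(const unsigned long *nr_free, int nr_orders,
                  int order, long min_pages)
{
    long free_pages = 0;

    /* Total free pages across all orders. */
    for (int o = 0; o < nr_orders; o++)
        free_pages += (long)(nr_free[o] << o);

    /* Account for the pages the request itself would take. */
    free_pages -= (1L << order) - 1;
    if (free_pages <= min_pages)
        return false;

    for (int o = 0; o < order; o++) {
        /* Blocks smaller than the request cannot satisfy it. */
        free_pages -= (long)(nr_free[o] << o);
        /* Require fewer free pages at each higher order. */
        min_pages >>= 1;
        if (free_pages <= min_pages)
            return false;
    }
    return true;
}
```

With the dump's numbers (4229 order-0 blocks, 34 order-1 blocks, min:1440kB = 360 pages), an order-0 request passes, but an order-2 request fails: once the 4229 single pages are discounted only 65 pages remain, below the halved watermark of 180. Under this model, widening the low/min gap wakes kswapd earlier but does not by itself create order-2 blocks.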