A bug in 10.2.0.5 RAC on Linux

Posted 2011-12-23 03:02
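The symptom on the system: the mmon process holds locks on some SYS-owned objects, and the CPU usage of that process then climbs to 100%.
After debugging, the trace file showed the following: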

*** ACTION NAME:(Remote-Flush Slave Action) 2011-10-25 20:00:08.996
*** MODULE NAME:(MMON_SLAVE) 2011-10-25 20:00:08.996
*** SERVICE NAME:(SYS$BACKGROUND) 2011-10-25 20:00:08.996
*** SESSION ID:(2553.18657) 2011-10-25 20:00:08.996
WARNING:io_submit failed due to kernel limitations MAXAIO for process=0 pending aio=0
WARNING:asynch I/O kernel limits is set at AIO-MAX-NR=65536 AIO-NR=65483
WARNING:1 Oracle process running out of OS kernelI/O resources aiolimit=0

ksfdgo()+1488<-ksfdaio1()+9848<-kfkUfsIO()+594<-kfkDoIO()+631<-kfkIOPriv()+616<-kfdIOPriv()+95<-kfioSubmitIO()+503<-kfioRequestPriv()+166<-kfioRequest()+689<-ksfd_osmgo()+1286<-ksfdgo()+1488<-ksfdaio1()+9848<-ksfqwr()+335<-kcflfi()+670<-kcvrsz()+1131<-ktfbfcsz()+657
<-ktfbfxtnd()+237<-ktfbtgex1()+2461<-ktsxs_add()+1480<-ktspnr_next()+1206<-ktrsexec()+437<-ktspbmphwm()+1229<-ktspmvhwm()+49<-ktsp_bump_hwm()+191<-ktspgsp_cbk()+983<-kdisnew()+304<-kdisnewle()+125<-kdisle()+4556<-kdiins0()+26993<-kauxsin()+3965<-insidx()+2509
<-insflush()+466<-insrow()+933<-insdrv()+589<-inscovexe()+399<-insExecStmtExecIniEngine()+85<-insexe()+384<-opiexe()+9334<-kpoal8()+2295<-opiodr()+1184<-kpoodrc()+38<-rpiswu2()+409<-kpoodr()+554<-upirtrc()+2101<-kpurcsc()+125<-kpuexecv8()+1705<-kpuexec()+2643
<-OCIStmtExecute()+41ssd_unwind_bp: unhandled instruction at 0x14fdbdf instr=6a
ssd_unwind_bp: unhandled instruction at 0x14fc333 instr=68
<-kewrose_oci_stmt_exec()+62<-kewrgwxf1_gwrsql_exft_1()+284<-kewrgwxf_gwrsql_exft()+451<-kewrews_execute_wr_sql()+52<-kewrftbs_flush_table_by_sql()+188<-kewrft_flush_table()+223<-kewrftec_flush_table_ehdlcx()+805<-kewrfat_flush_all_tables()+1243<-kewrfsr_flush_snapshot_r()+173
<-kewrrfs_remote_flush_slave()+1002<-kebm_slave_main()+221<-ksvrdp()+1159<-opirip()+748<-opidrv()+583<-sou2o()+114<-opimai_real()+317<-main()+116<-__libc_start_main()+219<-_start()+42
*** 2011-10-25 23:20:17.038
ssd_unwind_bp: unhandled instruction at 0x14fdbdf instr=6a
ssd_unwind_bp: unhandled instruction at 0x14fc333 instr=68
*** 2011-10-26 08:48:54.726
Received ORADEBUG command 'dump errorstack 3' from process Unix process pid: 1591, image:
*** 2011-10-26 08:48:54.726
ksedmp: internal or fatal error
Current SQL statement for this session:
insert into wrh$_sysstat   (snap_id, dbid, instance_number, stat_id, value)  select    :snap_id, :dbid, :instance_number, stat_id, value  from    v$sysstat  order by    stat_id
----- Call Stack Trace -----
calling              call     entry                argument values in hex     
location             type     point                (? means dubious value)    
-------------------- -------- -------------------- ----------------------------
ksedst()+31          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FBFFD6590 ? 7FBFFD65F0 ?
                                                   7FBFFD6530 ? 000000000 ?
ksedmp()+610         call     ksedst()             000000000 ? 000000001 ?
                                                   7FBFFD6590 ? 7FBFFD65F0 ?
                                                   7FBFFD6530 ? 000000000 ?
ksdxfdmp()+1153      call     ksedmp()             000000003 ? 000000001 ?
                                                   7FBFFD6590 ? 7FBFFD65F0 ?
                                                   7FBFFD6530 ? 000000000 ?

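The WARNING lines near the top of the trace give the rough picture: the system has run short of AIO resources (AIO-NR=65483 against AIO-MAX-NR=65536).
The session's wait state looks like this: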
SO: 0x159d85068, type: 4, owner: 0x15f94e478, flag: INIT/-/-/0x00
    (session) sid: 2553 trans: (nil), creator: 0x15f94e478, flag: (100051) USR/- BSY/-/-/-/-/-
              DID: 0002-02E5-00000030, short-term DID: 0000-0000-00000000
              txn branch: (nil)
              oct: 0, prv: 0, sql: (nil), psql: (nil), user: 0/SYS
    service name: SYS$BACKGROUND
    last wait for 'Data file init write' wait_time=0.000016 sec, seconds since wait started=46124
                count=1, intr=100, timeout=ffffffff
                blocking sess=0x(nil) seq=224
    Dumping Session Wait History
     for 'Data file init write' count=1 wait_time=0.000016 sec
                count=1, intr=100, timeout=ffffffff
     for 'Data file init write' count=1 wait_time=0.000016 sec
                count=1, intr=100, timeout=ffffffff
     for 'Data file init write' count=1 wait_time=0.000035 sec
                count=1, intr=100, timeout=ffffffff
     for 'Data file init write' count=1 wait_time=0.614215 sec
                count=1, intr=100, timeout=ffffffff
     for 'CSS operation: action' count=1 wait_time=0.000080 sec
                function_id=41, =0, =0
     for 'CSS initialization' count=1 wait_time=0.000004 sec
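The AIO shortage reported in the WARNING lines of the trace can be quantified with a small script. The following is only a sketch: the function name `aio_headroom` and the regular expression are my own, written against the single WARNING line quoted in this post, and other Oracle versions may format the message differently.

```python
import re

# Pattern for the "AIO-MAX-NR=... AIO-NR=..." warning seen in the trace.
# Assumption: the message format matches the line quoted in this post.
AIO_WARNING = re.compile(r"AIO-MAX-NR=(\d+)\s+AIO-NR=(\d+)")


def aio_headroom(trace_line):
    """Return (max_nr, nr, free) parsed from a WARNING line, or None if absent."""
    match = AIO_WARNING.search(trace_line)
    if match is None:
        return None
    max_nr, nr = int(match.group(1)), int(match.group(2))
    return max_nr, nr, max_nr - nr


line = "WARNING:asynch I/O kernel limits is set at AIO-MAX-NR=65536 AIO-NR=65483"
print(aio_headroom(line))  # (65536, 65483, 53)
```

With only 53 AIO request slots left system-wide, it is plausible that the next io_submit call from the MMON slave could not be satisfied.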
The fix is simple:
increase the value of fs.aio-max-nr. In this case, raising it to fs.aio-max-nr = 1048576 resolved the problem.
Reference: Metalink notes 1313555.1 and 9949948.8.
The problem is attributed to Bug 9949948.
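To check whether a Linux host is approaching the same limit, the kernel counters can be read from /proc/sys/fs/aio-nr and /proc/sys/fs/aio-max-nr. The sketch below assumes those files are readable; the helper names and the 0.9 threshold are illustrative choices of mine, not values taken from the Metalink notes. The new limit itself is normally applied with `sysctl -w fs.aio-max-nr=1048576` and persisted with a `fs.aio-max-nr = 1048576` line in /etc/sysctl.conf.

```python
from pathlib import Path

PROC_FS = Path("/proc/sys/fs")  # standard location of the Linux AIO counters


def read_aio_counters(proc_fs=PROC_FS):
    """Read the current and maximum system-wide AIO request counts."""
    nr = int((proc_fs / "aio-nr").read_text().strip())
    max_nr = int((proc_fs / "aio-max-nr").read_text().strip())
    return nr, max_nr


def is_near_limit(nr, max_nr, threshold=0.9):
    """True when usage reaches `threshold` of the limit (0.9 is an arbitrary cutoff)."""
    return max_nr > 0 and nr / max_nr >= threshold


if __name__ == "__main__":
    if (PROC_FS / "aio-nr").exists():
        nr, max_nr = read_aio_counters()
        print(f"aio-nr={nr} aio-max-nr={max_nr} near_limit={is_near_limit(nr, max_nr)}")
    else:
        print("AIO counters not available on this system")
```

Using the numbers from the trace, `is_near_limit(65483, 65536)` is true, while the same usage against the raised limit of 1048576 is not.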

