Chinaunix

E4500 crashed unexpectedly, please help analyze the log

#1 | Posted 2009-05-12 10:31
Last night one of our Sun E4500 servers rebooted unexpectedly. The log is below. It looks like a memory problem, but the log itself says it may not be one (the passage in red font). Please help analyze it. (It is too long, so I am posting it in two parts.)

May 11 17:54:43 sun001 SUNW,UltraSPARC-II: [ID 693101 kern.info] NOTICE: [AFT2] errID 0x0013ba98.ff65c20c DBI event on CPU8
May 11 17:54:43 sun001 SUNW,UltraSPARC-II: [ID 621741 kern.info] [AFT2] errID 0x0013ba98.ff65c20c PA=0x00000000.cb1fc500
May 11 17:54:43 sun001     E$tag 0x00000000.1b401963 E$State: Owner E$parity 0x0d
May 11 17:54:43 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000300.14542be8
May 11 17:54:43 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000000.104ac2a8
May 11 17:54:43 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x00000300.1da9c2f8
May 11 17:54:43 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x00000300.1da9c3d8
May 11 17:54:43 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
May 11 17:54:43 sun001 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x28): 0x00000300.86bbf440 *Bad* PSYND=0x0008
May 11 17:54:43 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000000.00000000
May 11 17:54:43 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x00000300.06bbf440
May 11 17:54:48 sun001 SUNW,UltraSPARC-II: [ID 783503 kern.warning] WARNING: [AFT1] WP event on CPU8, errID 0x0013ba9a.11b4d63a
May 11 17:54:48 sun001     AFSR 0x00000000.00800008<WP> AFAR 0x000001fb.bbfbfff0
May 11 17:54:48 sun001     AFSR.PSYND 0x0008(Score 95) AFSR.ETS 0x00 Fault_PC 0x1450de4
May 11 17:54:48 sun001     UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 365970 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU13 Data access at TL=0, errID 0x0013ba9b.e49f0626
May 11 17:54:56 sun001     AFSR 0x00000000.80200000<PRIV,UE> AFAR 0x00000000.cb1fc528
May 11 17:54:56 sun001     AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x10025334
May 11 17:54:56 sun001     UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0203<UE> UDBL.ESYND 0x03
May 11 17:54:56 sun001     UDBL Syndrome 0x3 Memory Module Board 0 J3101 J3201 J3301 J3401 J3501 J3601 J3701 J3801
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 513264 kern.warning] WARNING: [AFT1] errID 0x0013ba9b.e49f0626 Syndrome 0x3 indicates that this may not be a memory module problem
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 416178 kern.info] [AFT2] errID 0x0013ba9b.e49f0626 PA=0x00000000.cb1fc528
May 11 17:54:56 sun001     E$tag 0x00000000.1ac01963 E$State: Exclusive E$parity 0x0d
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000300.14542be8
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000000.104ac2a8
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x00000300.1da9c2f8
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x00000300.1da9c3d8
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x28): 0x00000300.86bbf440 *Bad* PSYND=0x00ff
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000000.00000000
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x00000300.06bbf440
May 11 17:54:56 sun001 unix: [ID 321153 kern.notice] NOTICE: Scheduling clearing of error on page 0x00000000.cb1fc000
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 288640 kern.info] [AFT3] errID 0x0013ba9b.e49f0626 Above Error detected by protected Kernel code
May 11 17:54:56 sun001     that will try to clear error from system
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 181349 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU13 Data access at TL=0, errID 0x0013ba9b.e75ade79
May 11 17:54:56 sun001     AFSR 0x00000000.80200000<PRIV,UE> AFAR 0x00000000.cb1fc528
May 11 17:54:56 sun001     AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x10025334
May 11 17:54:56 sun001     UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0203<UE> UDBL.ESYND 0x03
May 11 17:54:56 sun001     UDBL Syndrome 0x3 Memory Module Board 0 J3101 J3201 J3301 J3401 J3501 J3601 J3701 J3801
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 861501 kern.warning] WARNING: [AFT1] errID 0x0013ba9b.e75ade79 Syndrome 0x3 indicates that this may not be a memory module problem
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 387493 kern.info] [AFT2] errID 0x0013ba9b.e75ade79 PA=0x00000000.cb1fc528
May 11 17:54:56 sun001     E$tag 0x00000000.1ac01963 E$State: Exclusive E$parity 0x0d
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000300.14542be8
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000000.104ac2a8
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x00000300.1da9c2f8
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x00000300.1da9c3d8
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x28): 0x00000300.86bbf440 *Bad* PSYND=0x00ff
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000000.00000000
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x00000300.06bbf440
May 11 17:54:56 sun001 unix: [ID 321153 kern.notice] NOTICE: Scheduling clearing of error on page 0x00000000.cb1fc000
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 282690 kern.info] [AFT3] errID 0x0013ba9b.e75ade79 Above Error detected by protected Kernel code
May 11 17:54:56 sun001     that will try to clear error from system
May 11 17:55:08 sun001 SUNW,UltraSPARC-II: [ID 125562 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU4 Data access at TL=0, errID 0x0013ba9e.a14450bf
May 11 17:55:08 sun001     AFSR 0x00000000.80200000<PRIV,UE> AFAR 0x00000000.cb1fc528
May 11 17:55:08 sun001     AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1032cadc
May 11 17:55:08 sun001     UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0203<UE> UDBL.ESYND 0x03
May 11 17:55:08 sun001     UDBL Syndrome 0x3 Memory Module Board 0 J3101 J3201 J3301 J3401 J3501 J3601 J3701 J3801
May 11 17:55:08 sun001 SUNW,UltraSPARC-II: [ID 111876 kern.warning] WARNING: [AFT1] errID 0x0013ba9e.a14450bf Syndrome 0x3 indicates that this may not be a memory module problem
May 11 17:55:08 sun001 SUNW,UltraSPARC-II: [ID 512893 kern.info] [AFT2] errID 0x0013ba9e.a14450bf PA=0x00000000.cb1fc528
May 11 17:55:08 sun001     E$tag 0x00000000.1ac01963 E$State: Exclusive E$parity 0x0d
May 11 17:55:08 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000300.14542be8
May 11 17:55:08 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000000.104ac2a8
May 11 17:55:08 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x00000300.1da9c2f8
May 11 17:55:08 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x00000300.1da9c3d8
May 11 17:55:08 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
May 11 17:55:08 sun001 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x28): 0x00000300.86bbf440 *Bad* PSYND=0x00ff
May 11 17:55:08 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000000.00000000
May 11 17:55:08 sun001 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x00000300.06bbf440
May 11 17:55:08 sun001 unix: [ID 836849 kern.notice]
May 11 17:55:08 sun001 ^Mpanic[cpu4]/thread=3000404d1a0:
May 11 17:55:08 sun001 unix: [ID 377634 kern.notice] [AFT1] errID 0x0013ba9e.a14450bf UE Error(s)
May 11 17:55:08 sun001     See previous message(s) for details
May 11 17:55:08 sun001 unix: [ID 100000 kern.notice]
May 11 17:55:08 sun001 genunix: [ID 723222 kern.notice] 000002a101556ef0 SUNW,UltraSPARC-II:cpu_aflt_log+568 (2a101556fae, 1, 1014e328, 2a101557138, 2a101556ffb, 1014e350)
May 11 17:55:08 sun001 genunix: [ID 179002 kern.notice]   %l0-3: 0000000000000000 0000000000000003 000002a101557200 0000000000000010
May 11 17:55:08 sun001   %l4-7: 000000000020294c 0000000010412f60 00000000001dc387 00000000ff23c000
May 11 17:55:09 sun001 genunix: [ID 723222 kern.notice] 000002a101557140 SUNW,UltraSPARC-II:cpu_async_error+868 (1046a270, 2a101557200, 80200000, 0, 650180080200000, 2a1015573c0)
May 11 17:55:09 sun001 genunix: [ID 179002 kern.notice]   %l0-3: 000000001040db3c 0000000000000032 0000000000000203 0000000000000000
May 11 17:55:09 sun001   %l4-7: 00000000cb1fc500 0000000000800000 0000000000800000 0000000000000001
May 11 17:55:09 sun001 genunix: [ID 723222 kern.notice] 000002a101557310 unix:prom_rtt+0 (3000d8c46e4, 3000d8c4718, 1, 0, 2000, 3000d8c4718)
May 11 17:55:09 sun001 genunix: [ID 179002 kern.notice]   %l0-3: 0000000000000002 0000000000001400 0000004400001601 0000000010145974
May 11 17:55:09 sun001   %l4-7: 0000000010433ee8 00000000001484d4 0000000000000000 000002a1015573c0
May 11 17:55:09 sun001 genunix: [ID 723222 kern.notice] 000002a101557460 genunix:struioget+14c (34, 3000de03840, 2a101557730, 3000c9984f0, 1, 3000d8c4620)
May 11 17:55:09 sun001 genunix: [ID 179002 kern.notice]   %l0-3: 0000030000131900 00000000001db5cc 000000000000004e 00000000001dc378
May 11 17:55:09 sun001   %l4-7: 00000000001dce74 00000000001dc2dc 00000000fee7c0c4 0000000000000000
May 11 17:55:09 sun001 genunix: [ID 723222 kern.notice] 000002a101557520 tcp:tcp_wrw+54 (2a101557730, 3000de03840, 3001da9c3d8, 0, 1dab40, 1e2950)
May 11 17:55:09 sun001 genunix: [ID 179002 kern.notice]   %l0-3: 00000000001ed308 0000000000000001 0000000000202e46 00000000002435f0
May 11 17:55:09 sun001   %l4-7: 0000000000000000 0000000000000001 0000000000000001 0000000000000000
May 11 17:55:10 sun001 genunix: [ID 723222 kern.notice] 000002a1015575d0 genunix:rwnext+23c (3001da9c440, 3001da9c500, 0, 3001da9c3d8, 2a101557730, 1032ca14)
May 11 17:55:10 sun001 genunix: [ID 179002 kern.notice]   %l0-3: 000003001be20ad8 000003001da9c4b8 000003000de03840 000002a101557a00
May 11 17:55:10 sun001   %l4-7: 000000000000004c 0000000000000000 000002a101557868 0000000000000000
May 11 17:55:10 sun001 genunix: [ID 723222 kern.notice] 000002a101557680 genunix:strput+38c (0, 2a101557a00, 3001be20ad8, 8, 0, 0)
May 11 17:55:10 sun001 genunix: [ID 179002 kern.notice]   %l0-3: 000002a101557930 0000000000000000 00000000001e2950 00000000001dab40
May 11 17:55:10 sun001   %l4-7: 0000000000000000 00000000104133e8 0000000000000004 0000000000000000
May 11 17:55:10 sun001 genunix: [ID 723222 kern.notice] 000002a101557870 genunix:strwrite+200 (850, 2a101557930, 300201ce640, 1000000, 3000f4bedd8, 2a101557a00)
May 11 17:55:10 sun001 genunix: [ID 179002 kern.notice]   %l0-3: 000003001856b068 0000000000004000 000003001be20ad8 0000000000000003
May 11 17:55:10 sun001   %l4-7: 0000000000000001 000003001856b0e8 00000000001e2950 0000000000000000
May 11 17:55:10 sun001 genunix: [ID 723222 kern.notice] 000002a101557940 genunix:write+204 (18322, 34, 3, 30015d35508, 7, 34)
May 11 17:55:10 sun001 genunix: [ID 179002 kern.notice]   %l0-3: 0000000010309b20 0000000000000034 000003000f4bedd8 0000000000000000
May 11 17:55:10 sun001   %l4-7: 000000000020294c 0000000010412f60 00000000001dc387 00000000ff23c000
May 11 17:55:11 sun001 genunix: [ID 723222 kern.notice] 000002a101557a40 genunix:write32+30 (7, 202e46, 34, 1dc378, 21c0c, fedbba30)
May 11 17:55:11 sun001 genunix: [ID 179002 kern.notice]   %l0-3: 0000000000000004 00000300173ad570 00000000001dc67c 00000000001dc5a8
May 11 17:55:11 sun001   %l4-7: 0000000000246014 0000000000245f94 00000000001dc674 0000000000000000
May 11 17:55:11 sun001 unix: [ID 100000 kern.notice]
May 11 17:55:11 sun001 genunix: [ID 672855 kern.notice] syncing file systems...
May 11 17:55:13 sun001 genunix: [ID 733762 kern.notice]  21
May 11 17:55:14 sun001 genunix: [ID 733762 kern.notice]  8
May 11 17:55:53 sun001 last message repeated 20 times
May 11 17:55:54 sun001 genunix: [ID 622722 kern.notice]  done (not all i/o completed)
May 11 17:55:55 sun001 genunix: [ID 353387 kern.notice] dumping to /dev/dsk/c2t10d0s1, offset 1718222848
May 11 17:58:41 sun001 genunix: [ID 409368 kern.notice] ^M100% done: 104380 pages dumped, compression ratio 2.46,
May 11 17:58:41 sun001 genunix: [ID 851671 kern.notice] dump succeeded

[ Last edited by shediaofan on 2009-5-12 10:42 ]
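A log this long is easier to reason about once it is reduced to one row per error event. The following is a minimal sketch, not a Sun tool: the function names `summarize` and `reg_to_int` are made up for illustration, and the regular expressions assume the message layout seen in this particular paste. It pulls out each [AFT1] event, the CPU that reported it, and the AFAR on the following line, then aligns the address down to 64 bytes (the span covered by the eight E$Data words at offsets 0x00 to 0x38).

```python
import re

# [AFT1] warnings that name an event type, a CPU, and an errID.
EVENT_RE = re.compile(
    r"WARNING: \[AFT1\] (?P<event>.+?) on CPU(?P<cpu>\d+)"
    r".*?errID (?P<errid>0x[0-9a-f]+\.[0-9a-f]+)"
)
# Continuation lines that carry the fault address register.
AFAR_RE = re.compile(r"AFAR (0x[0-9a-f]+\.[0-9a-f]+)")


def reg_to_int(text):
    """Convert the log's 'hi32.lo32' hex notation to one integer."""
    hi, lo = text[2:].split(".")
    return (int(hi, 16) << 32) | int(lo, 16)


def summarize(lines):
    """Return one dict per [AFT1] event, with the AFAR that follows it."""
    events = []
    for line in lines:
        m = EVENT_RE.search(line)
        if m:
            events.append({
                "event": m.group("event"),
                "cpu": int(m.group("cpu")),
                "errid": m.group("errid"),
                "afar": None,
            })
            continue
        m = AFAR_RE.search(line)
        if m and events and events[-1]["afar"] is None:
            events[-1]["afar"] = reg_to_int(m.group(1))
    return events


# Four lines taken from the log above.
SAMPLE = """\
May 11 17:54:48 sun001 SUNW,UltraSPARC-II: [ID 783503 kern.warning] WARNING: [AFT1] WP event on CPU8, errID 0x0013ba9a.11b4d63a
May 11 17:54:48 sun001     AFSR 0x00000000.00800008<WP> AFAR 0x000001fb.bbfbfff0
May 11 17:54:56 sun001 SUNW,UltraSPARC-II: [ID 365970 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU13 Data access at TL=0, errID 0x0013ba9b.e49f0626
May 11 17:54:56 sun001     AFSR 0x00000000.80200000<PRIV,UE> AFAR 0x00000000.cb1fc528
""".splitlines()

for ev in summarize(SAMPLE):
    line_base = ev["afar"] & ~0x3F  # align down to a 64-byte boundary
    print(f"CPU{ev['cpu']:<3} {ev['event']:<28} "
          f"AFAR={ev['afar']:#x} line={line_base:#x}")
```

Run over the whole file, this makes it easy to see that the uncorrectable errors on CPU13 and CPU4 share the AFAR 0x00000000.cb1fc528, whose 64-byte base 0xcb1fc500 is the PA in the very first CPU8 entry. What that implies about the faulty part is left to the discussion below.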

#2 | Posted 2009-05-12 16:46
Has it been running normally since the reboot? Are any new log entries being generated?

#3 | Posted 2009-05-12 17:48
My guess is that it is one of the CPUs on CPU board 0.

[ Last edited by doging on 2009-5-12 17:53 ]

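#4 | Posted 2009-05-19 08:22
It has been running normally since the reboot, and the logs are clean. I found a chance to drop to the ok prompt and run the self-test at the maximum diagnostic level, and to my surprise it reported no errors at all. Could this be the legendary bug that needs a patch? I could not work out from the threads on this forum which patch is actually required.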

#5 | Posted 2009-05-19 10:43
Both the CPU and the memory have problems. I suggest replacing the CPU and the memory that reported errors.
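To see which class of error each CPU actually reported, the AFSR values in the log can be decoded bit by bit. The sketch below assumes the UltraSPARC-II AFSR layout as commonly documented (flag bits 20 to 32, ETS in bits 16 to 19, P_SYND in bits 0 to 15); those positions are not stated anywhere in this thread and should be checked against the processor manual. The names `AFSR_BITS`, `parse_reg`, and `decode_afsr` are made up for illustration.

```python
# Assumed UltraSPARC-II AFSR flag positions (verify against the CPU manual).
AFSR_BITS = [
    (32, "ME"), (31, "PRIV"), (30, "ISAP"), (29, "ETP"), (28, "IVUE"),
    (27, "TO"), (26, "BERR"), (25, "LDP"), (24, "CP"), (23, "WP"),
    (22, "EDP"), (21, "UE"), (20, "CE"),
]


def parse_reg(text):
    """Convert the log's 'hi32.lo32' hex notation to one integer."""
    hi, lo = text[2:].split(".")
    return (int(hi, 16) << 32) | int(lo, 16)


def decode_afsr(value):
    """Split an AFSR value into flag names, ETS, and parity syndrome."""
    flags = [name for bit, name in AFSR_BITS if (value >> bit) & 1]
    return {
        "flags": flags,
        "ets": (value >> 16) & 0xF,    # E$ tag parity syndrome
        "psynd": value & 0xFFFF,       # data parity syndrome
    }


# The two distinct AFSR values that appear in the log.
for cpu, raw in [("CPU8", "0x00000000.00800008"),
                 ("CPU13/CPU4", "0x00000000.80200000")]:
    print(cpu, decode_afsr(parse_reg(raw)))
```

Under that assumed layout, the CPU8 value decodes to WP with parity syndrome 0x0008 and the CPU13/CPU4 value decodes to PRIV plus UE, which agrees with the flags and the AFSR.PSYND and AFSR.ETS fields printed in the log itself.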

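#6 | Posted 2009-05-19 11:46
I also think it is a CPU problem, specifically CPU13. I suggest replacing the CPU first; memory is unlikely to be the cause.

Also, when you run the self-test, you need to run obdiag at the ok prompt at least 20 times for the result to mean anything.

In addition, on an old machine like this the serial port does not retain the messages that were printed when the fault occurred. I recommend attaching a terminal to the serial port and saving its output.

When the problem occurs again, messages will be written to that terminal, and that output is very important.

You could also analyze the core file from this crash with CAT.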
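One way to keep the console output is to pipe whatever terminal program is attached to the serial line through a small logger that timestamps every line. This is only a sketch under that assumption: it reads from any iterable of lines rather than from the serial device itself, and the names `stamp` and `capture` are made up for illustration.

```python
import sys
from datetime import datetime


def stamp(line, now):
    """Prefix one console line with a wall-clock timestamp."""
    return f"{now:%Y-%m-%d %H:%M:%S} {line.rstrip()}"


def capture(stream, write, clock=datetime.now):
    """Copy every line from stream to write(), timestamped on arrival."""
    for line in stream:
        write(stamp(line, clock()) + "\n")


def log_stdin(path):
    """Append timestamped stdin lines to path until stdin closes."""
    # Line buffering (buffering=1) keeps the file current if the host hangs.
    with open(path, "a", encoding="utf-8", buffering=1) as log:
        capture(sys.stdin, log.write)
```

Calling `log_stdin("console.log")` from a script whose standard input is the console stream appends to the file for as long as the stream stays open, so the messages printed just before a crash survive even if the server never writes them to its own disk.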

#7 | Posted 2009-05-19 15:16
The 4500 I dealt with produced no core dump at all. It logged AFT errors and then suddenly rebooted, without pointing to any memory module. Painful.

[ Last edited by myniker on 2009-5-19 15:18 ]

#8 | Posted 2009-05-19 15:32
Following along to learn...

#9 | Posted 2009-05-19 17:01
Even moderator Feng is still learning...
Just passing through, picking up points.