coologin posted on 2012-02-12 15:28

Oracle RAC cluster failure: blue screen and reboot. Help needed.

ORACLE RAC: made up of three PC servers, ORACLE 10g, WINDOWS 2003.
Two HP DL380 G5: dual processors, 4 GB RAM, 4G HBA cards.
One HP DL570 G2: four processors, 4 GB RAM, 2G HBA card.
Storage: HP EVA4400

Under normal load each server uses 2.4 GB of memory and 10% CPU.


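The cluster has been rebooting repeatedly since 2007 (7 years, as I count it). The 2011 statistics show 20 reboots in total. Every server reboots, and there is no obvious pattern.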

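The heartbeat link is connected directly to a switch on the core network. The cluster was deployed in 2003, so it is almost ten years old by now. Could the network be the bottleneck?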

Does the heartbeat link also have to carry CACHE FUSION traffic? Does it need to be a high-speed link?


This is what I found in the DUMP file:


Name
--------
=== ODM Data Collection ===

ocssd reboot the node due to 2of3 voting disk were not reachable

FileName
----------------
ocssd.log of node2

FileComment
----------------------
[ CSSD]2012-01-12 14:21:00.270 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 5531 ms, disk (1/\\.\VOTEDSKA)
[ CSSD]2012-01-12 14:21:00.270 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 6453 ms, disk (2/\\.\VOTEDSKB)
[ CSSD]2012-01-12 14:21:01.270 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 4531 ms, disk (1/\\.\VOTEDSKA)
[ CSSD]2012-01-12 14:21:01.270 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 5453 ms, disk (2/\\.\VOTEDSKB)
[ CSSD]2012-01-12 14:21:02.270 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 3531 ms, disk (1/\\.\VOTEDSKA)
[ CSSD]2012-01-12 14:21:02.270 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 4453 ms, disk (2/\\.\VOTEDSKB)
[ CSSD]2012-01-12 14:21:03.270 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 2531 ms, disk (1/\\.\VOTEDSKA)
[ CSSD]2012-01-12 14:21:03.270 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 3453 ms, disk (2/\\.\VOTEDSKB)
[ CSSD]2012-01-12 14:21:03.286 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 2515 ms, disk (1/\\.\VOTEDSKA)
[ CSSD]2012-01-12 14:21:03.286 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 3437 ms, disk (2/\\.\VOTEDSKB)
[ CSSD]2012-01-12 14:21:03.286 >TRACE: clssnmDiskPMT: offline disk (-2135307061 ms) (3/DGSFDB)
[ CSSD]2012-01-12 14:21:04.286 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 1515 ms, disk (1/\\.\VOTEDSKA)
[ CSSD]2012-01-12 14:21:04.286 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 2437 ms, disk (2/\\.\VOTEDSKB)
[ CSSD]2012-01-12 14:21:04.426 >TRACE: clssnmSendingThread: sending status msg to all nodes
[ CSSD]2012-01-12 14:21:04.426 >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
[ CSSD]2012-01-12 14:21:05.286 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 515 ms, disk (1/\\.\VOTEDSKA)
[ CSSD]2012-01-12 14:21:05.286 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 1437 ms, disk (2/\\.\VOTEDSKB)
[ CSSD]2012-01-12 14:21:05.817 >TRACE: clssnmDiskPMT: stale disk (200016 ms) (1/\\.\VOTEDSKA)
[ CSSD]2012-01-12 14:21:05.817 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 906 ms, disk (2/\\.\VOTEDSKB)
[ CSSD]2012-01-12 14:21:05.817 >ERROR: clssnmDiskPMT: Aborting, 2 of 3 voting disks unavailable
[ CSSD]2012-01-12 14:21:05.817 >ERROR: ###################################
[ CSSD]2012-01-12 14:21:05.817 >ERROR: clssscExit: CSSD aborting from thread clssnmvDiskPingMonitorThread
[ CSSD]2012-01-12 14:21:05.817 >ERROR: ###################################
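
To make the sequence in the log above easier to follow, here is a minimal Python sketch that extracts the "termination in N ms" countdown for each voting disk and the final abort line. The function name `summarize`, the regular expressions, and the sample list are illustrative assumptions, not part of any Oracle tool. The patterns are fitted only to the lines shown in this post and may need adjusting for other ocssd.log variants.

```python
import re
from collections import defaultdict

# Countdown warnings, e.g.
# "... termination in 5531 ms, disk (1/\\.\VOTEDSKA)"
WARN_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s*>WARNING:\s*"
    r"clssnmDiskPMT: voting device hang at \d+% fatal, "
    r"termination in (?P<ms>\d+) ms, disk \((?P<disk>[^)]+)\)"
)
# Final abort line, e.g. "Aborting, 2 of 3 voting disks unavailable"
ABORT_RE = re.compile(
    r">ERROR:\s*clssnmDiskPMT: Aborting, (\d+) of (\d+) voting disks unavailable"
)


def summarize(lines):
    """Return ({disk: [(timestamp, ms_left), ...]}, (unavailable, total) or None)."""
    countdown = defaultdict(list)
    abort = None
    for line in lines:
        m = WARN_RE.search(line)
        if m:
            countdown[m.group("disk")].append((m.group("ts"), int(m.group("ms"))))
            continue
        m = ABORT_RE.search(line)
        if m:
            abort = (int(m.group(1)), int(m.group(2)))
    return dict(countdown), abort


# Three lines copied from the log in this post (raw strings keep the backslashes).
sample = [
    r"[ CSSD]2012-01-12 14:21:00.270 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 5531 ms, disk (1/\\.\VOTEDSKA)",
    r"[ CSSD]2012-01-12 14:21:00.270 >WARNING: clssnmDiskPMT: voting device hang at 90% fatal, termination in 6453 ms, disk (2/\\.\VOTEDSKB)",
    r"[ CSSD]2012-01-12 14:21:05.817 >ERROR: clssnmDiskPMT: Aborting, 2 of 3 voting disks unavailable",
]

countdown, abort = summarize(sample)
for disk, points in countdown.items():
    print(disk, points)
print("abort (unavailable, total):", abort)
```

Run against the full ocssd.log of each node, a summary like this could show whether the same voting disks stall on every node at the same moments, which would point toward the shared storage path rather than one server.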

coologin posted on 2012-02-12 15:55

Network topology diagram

I have never understood whether cache fusion communicates over the heartbeat link.
And how do the nodes control the VOTING DISK?

duolanshizhe posted on 2012-02-13 10:53

CRS could not access the voting disks, which caused CRS to exit.
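
The reply above matches the last ERROR line in the log ("Aborting, 2 of 3 voting disks unavailable"). As far as I understand Oracle Clusterware, a node has to be able to reach a strict majority of the voting disks, otherwise CSSD aborts. The sketch below only illustrates that majority rule. The function name `node_survives` is made up for this example and is not an Oracle API.

```python
def node_survives(total_disks: int, unavailable: int) -> bool:
    """Illustrative majority check: the node keeps running only if it can
    still reach strictly more than half of the voting disks."""
    if not 0 <= unavailable <= total_disks:
        raise ValueError("unavailable must be between 0 and total_disks")
    accessible = total_disks - unavailable
    return 2 * accessible > total_disks


# The case from the log: 3 voting disks, 2 of them unavailable.
print(node_survives(3, 2))
# One disk lost out of three still leaves a majority.
print(node_survives(3, 1))
```

Under this assumption, losing VOTEDSKA and VOTEDSKB at the same moment is enough to take the node down even though the third disk is configured, so the thing to explain is why two disks on the shared storage stopped responding together.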

renxiao2003 posted on 2012-02-27 23:39

Learned something here. I should set up a lab environment when I get the chance.

zhlin0054 posted on 2012-08-13 08:15

Learning from everyone here.