jsnjycy posted on 2012-07-26 10:18

Two of the three nodes in a RAC cluster keep rebooting

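The hosts are three HP DL580 G5 servers, each with dual Fibre Channel HBAs connected to two IBM FC switches (switch 1 and switch 2). The two switches connect to controllers A and B of a single DS4700, respectively; that is, switch 1 goes to controller A and switch 2 goes to controller B.
The multipath driver is "linuxrdac-09.03.0B05.0331".
The operating system is "CentOS release 4.8 (Final)".
The Oracle version is as follows: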

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE        10.2.0.4.0        Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production


I checked /var/log/messages and found the following:
Jul 26 07:22:11 oraclerac3 kernel:       blocks= 286677119 block_size= 512
Jul 26 07:22:11 oraclerac3 kernel:       heads= 255, sectors= 32, cylinders= 35132
Jul 26 07:22:11 oraclerac3 kernel:
Jul 26 07:32:11 oraclerac3 kernel:       blocks= 286677119 block_size= 512
Jul 26 07:32:11 oraclerac3 kernel:       heads= 255, sectors= 32, cylinders= 35132
Jul 26 07:32:11 oraclerac3 kernel:
Jul 26 07:42:12 oraclerac3 kernel:       blocks= 286677119 block_size= 512
Jul 26 07:42:12 oraclerac3 kernel:       heads= 255, sectors= 32, cylinders= 35132
Jul 26 07:42:12 oraclerac3 kernel:
Jul 26 07:52:12 oraclerac3 kernel:       blocks= 286677119 block_size= 512
Jul 26 07:52:12 oraclerac3 kernel:       heads= 255, sectors= 32, cylinders= 35132
Jul 26 07:52:12 oraclerac3 kernel:
Jul 26 08:02:11 oraclerac3 kernel:       blocks= 286677119 block_size= 512
Jul 26 08:02:11 oraclerac3 kernel:       heads= 255, sectors= 32, cylinders= 35132
Jul 26 08:02:11 oraclerac3 kernel:
Jul 26 08:12:14 oraclerac3 kernel:       blocks= 286677119 block_size= 512
Jul 26 08:12:14 oraclerac3 kernel:       heads= 255, sectors= 32, cylinders= 35132
Jul 26 08:12:14 oraclerac3 kernel:
Jul 26 08:22:12 oraclerac3 kernel:       blocks= 286677119 block_size= 512
Jul 26 08:22:12 oraclerac3 kernel:       heads= 255, sectors= 32, cylinders= 35132
Jul 26 08:22:12 oraclerac3 kernel:
Jul 26 08:32:12 oraclerac3 kernel:       blocks= 286677119 block_size= 512
Jul 26 08:32:12 oraclerac3 kernel:       heads= 255, sectors= 32, cylinders= 35132
Jul 26 08:32:12 oraclerac3 kernel:
Jul 26 08:42:14 oraclerac3 kernel:       blocks= 286677119 block_size= 512
Jul 26 08:42:14 oraclerac3 kernel:       heads= 255, sectors= 32, cylinders= 35132
Jul 26 08:42:14 oraclerac3 kernel:
Jul 26 08:52:10 oraclerac3 kernel:       blocks= 286677119 block_size= 512
Jul 26 08:52:10 oraclerac3 kernel:       heads= 255, sectors= 32, cylinders= 35132
Jul 26 08:52:10 oraclerac3 kernel:
Jul 26 09:02:14 oraclerac3 kernel:       blocks= 286677119 block_size= 512
Jul 26 09:02:14 oraclerac3 kernel:       heads= 255, sectors= 32, cylinders= 35132
Jul 26 09:02:14 oraclerac3 kernel:
Jul 26 09:12:12 oraclerac3 kernel:       blocks= 286677119 block_size= 512
Jul 26 09:12:12 oraclerac3 kernel:       heads= 255, sectors= 32, cylinders= 35132
Jul 26 09:12:12 oraclerac3 kernel:
Jul 26 09:13:30 oraclerac3 kernel: 122 DS4700_01:0:0:1 Controller IO time expired. Delta 304 secs
Jul 26 09:13:30 oraclerac3 kernel: 497 DS4700_01:0:0:1 Failed controller to 1. retry. vcmnd SN 27916509 pdev H4:C0:T0:L1 0x00/0x00/0x00 0x06000000 mpp_status:8
Jul 26 09:13:30 oraclerac3 kernel: 10 DS4700_01:1 Failover command issued
Jul 26 09:13:31 oraclerac3 su(pam_unix): session closed for user oracle
Jul 26 09:17:23 oraclerac3 syslogd 1.4.1: restart.
Jul 26 09:17:23 oraclerac3 syslog: syslogd 启动 succeeded
Jul 26 09:17:23 oraclerac3 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Jul 26 09:17:23 oraclerac3 kernel: Bootdata ok (command line is ro root=/dev/VolGroup00/LogVol00 rhgb quiet)
Jul 26 09:17:23 oraclerac3 kernel: Linux version 2.6.9-89.ELlargesmp (mockbuild@builder10.centos.org) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-11)) #1 SMP Mon Jun 22 12:46:58 EDT 2009
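
For anyone reading the timestamps in this excerpt, here is a rough sketch of the arithmetic in Python. It assumes that "Delta 304 secs" is the age of the stalled I/O when the RDAC timeout fired (that reading is an interpretation, not something the log states), and it takes the year from the post date because syslog lines do not carry one.

```python
from datetime import datetime, timedelta

# Timestamps copied from the /var/log/messages excerpt; the year (2012) is an
# assumption taken from the post date.
FMT = "%Y %b %d %H:%M:%S"
timeout_logged = datetime.strptime("2012 Jul 26 09:13:30", FMT)
last_line_before_reboot = datetime.strptime("2012 Jul 26 09:13:31", FMT)
syslogd_restart = datetime.strptime("2012 Jul 26 09:17:23", FMT)

# Assumption: "Delta 304 secs" is how long the I/O had been outstanding
# when "Controller IO time expired" was logged.
io_issued = timeout_logged - timedelta(seconds=304)

# Time between the last line written before the reboot and the syslogd restart.
reboot_gap = syslogd_restart - last_line_before_reboot

print("stalled I/O issued around:", io_issued.strftime("%H:%M:%S"))
print("seconds from last log line to syslogd restart:",
      int(reboot_gap.total_seconds()))
```

Under that assumption the I/O would have been stuck since about 09:08:26, roughly five minutes before the failover command, and the node was back up and logging a little under four minutes after its last message.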


jsnjycy posted on 2012-07-26 10:25

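Is this a problem with the connection between the hosts and the storage array? The array does not seem to report any errors, though.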

Date/Time: 12-7-26 9:14:26
Sequence number: 359628
Event type: 2023
Event category: Internal
Priority: Informational
Description: Media scan (scrub) completed
Event specific codes: 0/0/0
Component type: Logical Drive
Component location: Logical Drive 1
Logged by: Controller in slot B

Raw data:
4d 45 4c 48 03 00 00 00 cc 7c 05 00 00 00 00 00
23 20 4d 00 f2 99 10 50 00 00 00 10 00 00 00 00
00 00 00 00 04 00 00 00 0d 00 00 00 0d 00 00 00
02 00 00 00 00 31 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
01 00 00 00 00 00 01 01 14 00 00 00 10 00 18 06
05 00 00 00 09 00 00 00 00 00 00 00 00 00 00 00



Date/Time: 12-7-26 9:14:25
Sequence number: 359627
Event type: 210A
Event category: Internal
Priority: Informational
Description: Controller cache not enabled or was internally disabled
Event specific codes: 0/0/0
Component type: Controller
Component location: Enclosure 85, Slot 2
Logged by: Controller in slot B

Raw data:
4d 45 4c 48 03 00 00 00 cb 7c 05 00 00 00 00 00
0a 21 48 00 f1 99 10 50 00 00 00 00 00 00 00 00
00 00 00 00 04 00 00 00 22 00 00 00 22 00 00 00
08 00 00 00 55 00 00 00 02 00 00 00 01 00 00 00
0a 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
01 00 00 00 00 00 01 00 00 00 00 00


Date/Time: 12-7-26 9:13:55
Sequence number: 359626
Event type: 2024
Event category: Internal
Priority: Informational
Description: Media scan (scrub) resumed
Event specific codes: 0/0/0
Component type: Logical Drive
Component location: Logical Drive 1
Logged by: Controller in slot B

Raw data:
4d 45 4c 48 03 00 00 00 ca 7c 05 00 00 00 00 00
24 20 4d 00 d3 99 10 50 00 00 00 10 00 00 00 00
00 00 00 00 04 00 00 00 0d 00 00 00 0d 00 00 00
02 00 00 00 00 31 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
01 00 00 00 00 00 01 01 10 00 00 00 0c 00 13 06
01 00 00 00 00 00 00 00 00 e0 4e 49


Date/Time: 12-7-26 9:13:55
Sequence number: 359625
Event type: 300D
Event category: Command
Priority: Informational
Description: Mode select for redundant controller page 2C received
Event specific codes: 0/0/0
Component type: Controller
Component location: Enclosure 85, Slot 2
Logged by: Controller in slot B

Raw data:
4d 45 4c 48 03 00 00 00 c9 7c 05 00 00 00 00 00
0d 30 38 00 d3 99 10 50 00 00 00 00 06 00 00 00
00 00 00 00 03 00 00 00 22 00 00 00 22 00 00 00
08 00 00 00 55 00 00 00 02 00 00 00 01 00 00 00
0a 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
01 00 00 00 00 00 01 04 7c 00 00 00 20 00 1a 06
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20 00 1a 86 00 01 00 00 14 02 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 20 00 1a 86 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 0a 00 1a 86 00 00 00 00
00 00 00 00 00 00 00 00


Date/Time: 12-7-26 9:13:53
Sequence number: 359624
Event type: 210A
Event category: Internal
Priority: Informational
Description: Controller cache not enabled or was internally disabled
Event specific codes: 0/0/0
Component type: Controller
Component location: Enclosure 85, Slot 2
Logged by: Controller in slot B

Raw data:
4d 45 4c 48 03 00 00 00 c8 7c 05 00 00 00 00 00
0a 21 48 00 d1 99 10 50 00 00 00 00 00 00 00 00
00 00 00 00 04 00 00 00 22 00 00 00 22 00 00 00
08 00 00 00 55 00 00 00 02 00 00 00 01 00 00 00
0a 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
01 00 00 00 00 00 01 00 00 00 00 00


Date/Time: 12-7-26 9:14:26
Sequence number: 359623
Event type: 2024
Event category: Internal
Priority: Informational
Description: Media scan (scrub) resumed
Event specific codes: 0/0/0
Component type: Logical Drive
Component location: Logical Drive 1
Logged by: Controller in slot A

Raw data:
4d 45 4c 48 03 00 00 00 c7 7c 05 00 00 00 00 00
24 20 4d 00 f2 99 10 50 00 00 00 10 00 00 00 00
00 00 00 00 04 00 00 00 0d 00 00 00 0d 00 00 00
02 00 00 00 00 31 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
01 00 00 00 00 00 00 01 10 00 00 00 0c 00 13 06
01 00 00 00 00 00 00 00 00 40 4f 49


Date/Time: 12-7-26 9:14:26
Sequence number: 359622
Event type: 300D
Event category: Command
Priority: Informational
Description: Mode select for redundant controller page 2C received
Event specific codes: 0/0/0
Component type: Controller
Component location: Enclosure 85, Slot 1
Logged by: Controller in slot A

Raw data:
4d 45 4c 48 03 00 00 00 c6 7c 05 00 00 00 00 00
0d 30 38 00 f2 99 10 50 00 00 00 00 07 00 00 00
04 00 00 00 03 00 00 00 22 00 00 00 22 00 00 00
08 00 00 00 55 00 00 00 01 00 00 00 01 00 00 00
0a 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
01 00 00 00 00 00 00 04 7c 00 00 00 20 00 1a 06
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20 00 1a 86 00 02 00 00 14 00 81 81 81 81 81 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 20 00 1a 86 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 0a 00 1a 86 00 00 00 00
00 00 00 00 00 00 00 00


Date/Time: 12-7-26 9:14:25
Sequence number: 359621
Event type: 210A
Event category: Internal
Priority: Informational
Description: Controller cache not enabled or was internally disabled
Event specific codes: 0/0/0
Component type: Controller
Component location: Enclosure 85, Slot 1
Logged by: Controller in slot A

Raw data:
4d 45 4c 48 03 00 00 00 c5 7c 05 00 00 00 00 00
0a 21 48 00 f1 99 10 50 00 00 00 00 00 00 00 00
00 00 00 00 04 00 00 00 22 00 00 00 22 00 00 00
08 00 00 00 55 00 00 00 01 00 00 00 01 00 00 00
0a 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 00 00 00 00


Date/Time: 12-7-26 9:13:55
Sequence number: 359620
Event type: 2023
Event category: Internal
Priority: Informational
Description: Media scan (scrub) completed
Event specific codes: 0/0/0
Component type: Logical Drive
Component location: Logical Drive 1
Logged by: Controller in slot A

Raw data:
4d 45 4c 48 03 00 00 00 c4 7c 05 00 00 00 00 00
23 20 4d 00 d3 99 10 50 00 00 00 10 00 00 00 00
00 00 00 00 04 00 00 00 0d 00 00 00 0d 00 00 00
02 00 00 00 00 31 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
01 00 00 00 00 00 00 01 14 00 00 00 10 00 18 06
05 00 00 00 09 00 00 00 00 00 00 00 00 00 00 00



Date/Time: 12-7-26 9:13:53
Sequence number: 359619
Event type: 210A
Event category: Internal
Priority: Informational
Description: Controller cache not enabled or was internally disabled
Event specific codes: 0/0/0
Component type: Controller
Component location: Enclosure 85, Slot 1
Logged by: Controller in slot A

Raw data:
4d 45 4c 48 03 00 00 00 c3 7c 05 00 00 00 00 00
0a 21 48 00 d1 99 10 50 00 00 00 00 00 00 00 00
00 00 00 00 04 00 00 00 22 00 00 00 22 00 00 00
08 00 00 00 55 00 00 00 01 00 00 00 01 00 00 00
0a 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 00 00 00 00
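
The entries above are listed by sequence number, which mixes the two controllers. A minimal sketch in Python for putting them in time order so they can be lined up against the host's /var/log/messages follows. The `SAMPLE` text holds only a few header fields copied from four of the entries above; the function name `parse_events` and the trimmed layout (no "Raw data" sections) are assumptions made for illustration.

```python
import re
from datetime import datetime

# Header fields copied from four of the event log entries above (raw data omitted).
SAMPLE = """\
Date/Time: 12-7-26 9:14:26
Sequence number: 359628
Description: Media scan (scrub) completed
Logged by: Controller in slot B

Date/Time: 12-7-26 9:13:55
Sequence number: 359625
Description: Mode select for redundant controller page 2C received
Logged by: Controller in slot B

Date/Time: 12-7-26 9:14:26
Sequence number: 359622
Description: Mode select for redundant controller page 2C received
Logged by: Controller in slot A

Date/Time: 12-7-26 9:13:53
Sequence number: 359619
Description: Controller cache not enabled or was internally disabled
Logged by: Controller in slot A
"""

def parse_events(text):
    """Parse blank-line-separated 'Key: value' blocks into dictionaries."""
    events = []
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = dict(line.split(": ", 1)
                      for line in block.splitlines() if ": " in line)
        events.append({
            "time": datetime.strptime(fields["Date/Time"], "%y-%m-%d %H:%M:%S"),
            "seq": int(fields["Sequence number"]),
            "slot": fields["Logged by"][-1],  # last character: "A" or "B"
            "description": fields["Description"],
        })
    return events

# Sort by timestamp first, then by sequence number to break ties.
events = sorted(parse_events(SAMPLE), key=lambda e: (e["time"], e["seq"]))
for e in events:
    print(e["time"].strftime("%H:%M:%S"), e["slot"], e["seq"], e["description"])
```

Sorted this way, every array event shown falls between 9:13:53 and 9:14:26, that is, after the host logged "Failover command issued" at 09:13:30, provided the array clock and the host clock agree.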


flutter posted on 2012-07-27 13:57

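Oracle officially gives a few main causes for RAC node reboots:
1. Heartbeat (interconnect) network problems
2. Clocks out of sync between nodes
3. Excessive load

The reboot is triggered by /etc/init.d/init.cssd. Take a look at the CRS logs; they will have a detailed record of the problem.
My own RAC also kept rebooting a few days ago, and I ran into both 1 and 2... The clocks on the three nodes had inexplicably ended up in the years 2012, 2010, and 2000.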
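
For cause 2, a quick way to see how far apart the node clocks are is to collect one timestamp per node at about the same moment and compare them. Here is a minimal sketch in Python; the function name `max_clock_skew`, the node names oraclerac1 and oraclerac2, and all the readings are hypothetical values for illustration, not taken from this cluster.

```python
from datetime import datetime

def max_clock_skew(readings):
    """Return (skew_seconds, earliest_node, latest_node) for {node: datetime}."""
    earliest = min(readings, key=readings.get)
    latest = max(readings, key=readings.get)
    skew = (readings[latest] - readings[earliest]).total_seconds()
    return skew, earliest, latest

# Hypothetical readings, e.g. gathered by running `date` on each node
# at roughly the same moment.
readings = {
    "oraclerac1": datetime(2012, 7, 26, 9, 13, 30),
    "oraclerac2": datetime(2012, 7, 26, 9, 13, 31),
    "oraclerac3": datetime(2012, 7, 26, 9, 13, 45),
}

skew, earliest, latest = max_clock_skew(readings)
print(f"max skew {skew:.0f}s between {earliest} and {latest}")
```

Clocks that are years apart, as in the case described above, would show up immediately in a comparison like this.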