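I was recently on a business trip and handled a job on site. Maybe I got lucky, maybe my skills are just limited. I'm posting the whole process here for everyone to look over; pointers are welcome.
A lucky break: recovering a 3310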
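Hardware: 2 x sun880, each with 6 x 72G disks;
1 x Sun StorEdge 3310, single controller, 6 x 36G disks;
Connection: single channel, dual host
RAID layout: disks 8, 9, 10, 11, 12 as RAID 5, disk 13 as hot spare
Software: Solaris 9
Problem: the 3310 array sounds a beeping alarm (pattern ***-, repeating); LEDs are normal, no amber light on;
host 2 cannot access the array and reports I/O ERROR;
host 1's access to the array is intermittent;
the database cannot be read or written.
Procedure: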
1. Install the sccli software
1) # cp 2.0.0_sw_solaris-sparc.zip /tmp
2) # unzip 2.0.0_sw_solaris-sparc.zip
3) # cd solaris/sparc
4) # pkgadd -d . SUNWsscs
2. Check the array status with sccli
# sccli
sccli: selected device /dev/rdsk/c2t0d0s2 [SUN StorEdge 3310 SN#080CD3]
sccli>
sccli> show logical-drives
LD LD-ID Size Assigned Type Disks Spare Failed Status
--------------------------------------------------------
ld0 0D839135 134.67GB Primary RAID5 3 0 1 Dead
sccli>
sccli> show disks
Ch Id Size Speed LD Status IDs Rev
--------------------------------------------------------
0 8 33.92GB 160MB ld0 ONLINE SEAGATE ST336607LSUN36G 0507
S/N 3JA72NYV00007429
0 9 33.92GB 160MB NONE USED SEAGATE ST336607LSUN36G 0507
S/N 3JAY46Q500007423
0 10 N/A N/A NONE BAD SEAGATE ST336607LSUN36G 0507
S/N 3JA1ESPC00007340
0 11 33.92GB 160MB ld0 ONLINE SEAGATE ST336607LSUN36G 0507
S/N 3JA1EF8J00007340
0 12 33.92GB 160MB ld0 ONLINE SEAGATE ST336607LSUN36G 0507
S/N 3JAY483500007325
0 13 33.92GB 160MB NONE USED HITACHI DK32EJ36NSUN36G PQ0B
S/N 49S1HVFA0040A3BR
sccli>
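The `show disks` listing above can also be checked mechanically. The following is only a minimal sketch, assuming the output has been saved as plain text in the column layout shown in this post; `parse_disks` and `not_online` are hypothetical helper names, not part of sccli.

```python
import re

# Sample text copied from the `show disks` output in this post.
SHOW_DISKS = """\
0 8 33.92GB 160MB ld0 ONLINE SEAGATE ST336607LSUN36G 0507
S/N 3JA72NYV00007429
0 9 33.92GB 160MB NONE USED SEAGATE ST336607LSUN36G 0507
S/N 3JAY46Q500007423
0 10 N/A N/A NONE BAD SEAGATE ST336607LSUN36G 0507
S/N 3JA1ESPC00007340
0 11 33.92GB 160MB ld0 ONLINE SEAGATE ST336607LSUN36G 0507
S/N 3JA1EF8J00007340
0 12 33.92GB 160MB ld0 ONLINE SEAGATE ST336607LSUN36G 0507
S/N 3JAY483500007325
0 13 33.92GB 160MB NONE USED HITACHI DK32EJ36NSUN36G PQ0B
S/N 49S1HVFA0040A3BR
"""

# Assumed row layout: channel, ID, size, speed, logical drive, status, vendor info.
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.*)$")


def parse_disks(text):
    """Return one dict per disk row; 'S/N ...' continuation lines are skipped."""
    disks = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            ch, disk_id, _size, _speed, ld, status, _ident = m.groups()
            disks.append({"ch": int(ch), "id": int(disk_id), "ld": ld, "status": status})
    return disks


def not_online(disks):
    """List (ID, status) for every disk whose status is not ONLINE."""
    return [(d["id"], d["status"]) for d in disks if d["status"] != "ONLINE"]


if __name__ == "__main__":
    for disk_id, status in not_online(parse_disks(SHOW_DISKS)):
        print(f"ID{disk_id}: {status}")
```

Run against the listing in this post, it reports ID9, ID10 and ID13, the three disks that need attention.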
sccli> show events
Thu Jan 27 17:41:19 2005
[0181] #1: StorEdge Array SN#27556 Controller NOTICE: controller initialization completed
Tue Nov 29 13:48:36 2005
[1113] #2: StorEdge Array SN#27556 CH0 ID10: SCSI Drive ALERT: bad block encountered (02h, 03h,0C/00)
SCSI Status:0x02 Sense Key:0x03 Sense Code:0x0c Sense Code Qualifier:0x00
Tue Nov 29 13:48:42 2005
[1117] #3: StorEdge Array SN#27556 CH0 ID10: SCSI Drive ALERT: block successfully reassigned
SCSI Status:0x00 Sense Key:0x00 Sense Code:0x00 Sense Code Qualifier:0x00
Tue Nov 29 14:41:37 2005
[1113] #4: StorEdge Array SN#27556 CH0 ID10: SCSI Drive ALERT: bad block encountered (02h, 03h,0C/00)
SCSI Status:0x02 Sense Key:0x03 Sense Code:0x0c Sense Code Qualifier:0x00
Tue Nov 29 14:41:40 2005
[1116] #5: StorEdge Array SN#27556 CH0 ID10: SCSI Drive ALERT: block reassignment failed
SCSI Status:0x02 Sense Key:0x03 Sense Code:0x0c Sense Code Qualifier:0x00
Tue Nov 29 14:41:40 2005
[2101] #6: LD-ID 0D839135 on StorEdge Array SN#27556: ALERT: SCSI drive failure (CH0 ID10)
Thu Dec 1 12:03:18 2005
[110F] #7: StorEdge Array SN#27556 CH0: SCSI Drive Channel ALERT: SCSI bus reset issued
SCSI Status:0x00 Sense Key:0x00 Sense Code:0x00 Sense Code Qualifier:0x00
==========================
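To see at a glance which drive the `show events` log above keeps pointing at, the ALERT lines can be counted per channel/ID. This is a rough sketch, assuming the event text has been saved as plain text; `alerts_per_drive` is a hypothetical helper, and the "CHn IDn" pattern is taken only from the lines shown in this post.

```python
import re
from collections import Counter

# Event lines copied from the `show events` output in this post.
EVENTS = """\
[0181] #1: StorEdge Array SN#27556 Controller NOTICE: controller initialization completed
[1113] #2: StorEdge Array SN#27556 CH0 ID10: SCSI Drive ALERT: bad block encountered (02h, 03h,0C/00)
[1117] #3: StorEdge Array SN#27556 CH0 ID10: SCSI Drive ALERT: block successfully reassigned
[1113] #4: StorEdge Array SN#27556 CH0 ID10: SCSI Drive ALERT: bad block encountered (02h, 03h,0C/00)
[1116] #5: StorEdge Array SN#27556 CH0 ID10: SCSI Drive ALERT: block reassignment failed
[2101] #6: LD-ID 0D839135 on StorEdge Array SN#27556: ALERT: SCSI drive failure (CH0 ID10)
[110F] #7: StorEdge Array SN#27556 CH0: SCSI Drive Channel ALERT: SCSI bus reset issued
"""

# A drive is identified in these lines as "CH<channel> ID<id>".
DRIVE = re.compile(r"CH(\d+) ID(\d+)")


def alerts_per_drive(text):
    """Count ALERT lines that name a specific drive, keyed by (channel, id)."""
    counts = Counter()
    for line in text.splitlines():
        if "ALERT" not in line:
            continue  # skip NOTICE lines and sense-data lines
        m = DRIVE.search(line)
        if m:  # channel-wide alerts (no ID) are not attributed to a drive
            counts[(int(m.group(1)), int(m.group(2)))] += 1
    return counts


if __name__ == "__main__":
    for (ch, disk_id), n in alerts_per_drive(EVENTS).items():
        print(f"CH{ch} ID{disk_id}: {n} alert(s)")
```

Every drive-specific alert in this log names CH0 ID10; the bus reset on Dec 1 is a channel-level event and is deliberately left out of the per-drive count.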
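From the output above, my analysis is:
1) Ch0 ID10 has failed and needs to be replaced.
2) Ch0 ID9 and ID13 show status "NONE USED", and the RAID 5 is "Dead".
I figured disks 9 and 13 might still be recoverable. If they could be brought back, the RAID 5 had a chance and the data might be saved. Put it this way: try, and the odds are nine to one against; don't try, and there is no chance at all.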
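The reasoning behind the "Dead" status can be written down as a small check. RAID 5 survives the loss of exactly one member, and here two of the five members (ID9 and ID10) are missing. This is a sketch only: `raid5_state` is a hypothetical helper, and the labels "good", "degraded" and "dead" are my own shorthand, not necessarily the controller's exact wording.

```python
def raid5_state(members, online):
    """Classify a RAID 5 set from its member IDs and the IDs currently online.

    RAID 5 keeps one disk's worth of parity, so it tolerates one missing
    member; with two or more missing the set cannot be read.
    """
    missing = sorted(set(members) - set(online))
    if not missing:
        return "good", missing
    if len(missing) == 1:
        return "degraded", missing  # still readable, can rebuild onto a spare
    return "dead", missing


if __name__ == "__main__":
    members = [8, 9, 10, 11, 12]  # RAID 5 members as described in this post

    # State seen in `show disks`: only 8, 11 and 12 are ONLINE in ld0.
    print(raid5_state(members, online=[8, 11, 12]))

    # If just one of the missing disks comes back, a rebuild becomes possible.
    print(raid5_state(members, online=[8, 10, 11, 12]))
```

That is why getting even one of the missing members back matters: it moves the set from unreadable to degraded, and a degraded set can be rebuilt onto a spare.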
3. Repair
sccli> show ip-address
192.1.1.68
Log in over the network cable: telnet 192.1.1.68
Choose VT100 mode, and press Ctrl+L to refresh the screen
- go to "view and edit SCSI Drives"
- select ID9 (NONE/USED)
- select "add Global spare"
- designate ID9 as global spare
- go to "view and edit configuration parameters"
--> select Disk Array parameters
--> select rebuild priority and set it to "Normal" or "Improved"
(when selecting "Improved", host I/O may be impacted)
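Failed: ID9 went to BAD.
Disappointed, I suddenly noticed that ID10's status had somehow changed to online.
A good opportunity, so I tried again with ID13: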
- go to "view and edit SCSI Drives"
- select ID13 (NONE/USED)
- select "add Global spare"
- designate ID13 as global spare
- go to "view and edit configuration parameters"
--> select Disk Array parameters
--> select rebuild priority and set it to "Normal" or "Improved"
(when selecting "Improved", host I/O may be impacted)
OK, the logical drive's status changed to good, hehe.
I called for spare parts. To be safe I asked for three disks and replaced 9, 10 and 13.