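A customer has several Sun V245 servers, and one of them went down this afternoon. When I got to the machine room, the fault LED was steadily lit and the OK LED was blinking green. The serial console produced no output at all, so I had no choice but to restart the machine.
After the restart, everything looks normal in the SC (system controller):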
sc> showenvironment
=============== Environmental Status ===============
--------------------------------------------------------------------------------
System Temperatures (Temperatures in Celsius):
--------------------------------------------------------------------------------
Sensor Status Temp LowHard LowSoft LowWarn HighWarn HighSoft HighHard
--------------------------------------------------------------------------------
MB.P0.T_CORE OK 73 -15 -10 0 100 105 110
MB.P1.T_CORE OK 77 -15 -10 0 100 105 110
MB.T_REMOTE OK 38 -- -- -- -- -- --
MB.T_1064 OK 68 -15 -10 0 105 110 115
MB.T_FIRE OK 47 -15 -10 0 95 105 108
MB.T_AMB OK 42 -15 -10 0 65 75 85
FIOB.T_AMB OK 22 -15 -10 0 45 47 50
PDB.T_DISK OK 30 -15 -10 0 55 65 70
PDB.T_PS0 OK 27 -15 -10 0 48 50 53
PDB.T_PS1 OK 28 -15 -10 0 48 50 53
--------------------------------------
Keyswitch:
--------------------------------------
Keyswitch position: NORMAL
--------------------------------------------------------
System Indicator Status:
--------------------------------------------------------
SYS.LOCATE SYS.SERVICE SYS.ACT
--------------------------------------------------------
OFF OFF ON
--------------------------------------------------------
SYS.PSFAIL SYS.OVERTEMP SYS.FANFAIL
--------------------------------------------------------
OFF OFF OFF
--------------------------------------------
System Disks:
--------------------------------------------
Disk Status Service OK2RM
--------------------------------------------
HDD0 OK OFF OFF
HDD1 OK OFF OFF
HDD2 OK OFF OFF
HDD3 OK OFF OFF
----------------------------------------------------------
Fans (Speeds Revolution Per Minute):
----------------------------------------------------------
Sensor Status Speed Warn Low
----------------------------------------------------------
PDB.HDDFB.FT6.F0 OK 10505 -- 8000
PDB.HDDFB.FT6.F1 OK 10714 -- 8000
FT0.F0 OK 3879 -- 2022
FT1.F0 OK 3924 -- 2022
FT2.F0 OK 3879 -- 2022
FT3.F0 OK 4066 -- 2022
FT4.F0 OK 3970 -- 2022
FT5.F0 OK 3924 -- 2022
--------------------------------------------------------------------------------
Voltage sensors (in Volts):
--------------------------------------------------------------------------------
Sensor Status Voltage LowSoft LowWarn HighWarn HighSoft
--------------------------------------------------------------------------------
MB.P0.V_CORE OK 1.45 1.21 1.23 1.57 1.60
MB.P1.V_CORE OK 1.48 1.21 1.23 1.57 1.60
MB.V_+3V3 OK 3.31 2.48 2.48 3.49 3.59
MB.V_+12V OK 12.10 9.04 9.04 12.96 13.56
MB.BAT.V_BAT OK 3.21 2.26 2.26 3.51 3.60
--------------------------------------------
Power Supply Indicators:
--------------------------------------------
Supply DC-OK AC-OK Service
--------------------------------------------
PS0 ON ON OFF
PS1 ON ON OFF
------------------------------------------------------------------------------
Power Supplies:
------------------------------------------------------------------------------
Supply Status Underspeed Overtemp Overvolt Undervolt Overcurrent
------------------------------------------------------------------------------
PS0 OK OFF OFF OFF OFF OFF
PS1 OK OFF OFF OFF OFF OFF
sc> showlogs
Log entries since FEB 14 03:25:31
----------------------------------
FEB 14 03:25:31 Portal2: 00060000: "SC Login: User admin Logged on."
FEB 14 03:25:31 Portal2: 00060007: "Failed to send email alert for recent event."
FEB 14 04:01:57 Portal2: 00060002: "SC Login: User admin Logged out."
FEB 14 04:01:57 Portal2: 00060007: "Failed to send email alert for recent event."
FEB 14 04:02:14 Portal2: 00060000: "SC Login: User admin Logged on."
FEB 14 04:02:14 Portal2: 00060007: "Failed to send email alert for recent event."
FEB 14 04:08:29 Portal2: 00040001: "SC Request to Power On Host."
FEB 14 04:08:29 Portal2: 00060007: "Failed to send email alert for recent event."
FEB 14 04:08:30 Portal2: 00040002: "Host System has Reset"
FEB 14 04:08:30 Portal2: 0004000b: "Host System has read and cleared bootmode."
FEB 14 04:08:33 Portal2: 00060007: "Failed to send email alert for recent event."
FEB 14 04:08:33 Portal2: 00060007: "Failed to send email alert for recent event."
FEB 14 04:09:18 Portal2: 0004004f: "Indicator PS0.DC_OK is now ON"
FEB 14 04:09:18 Portal2: 0004004f: "Indicator PS1.DC_OK is now ON"
FEB 14 04:09:18 Portal2: 00060007: "Failed to send email alert for recent event."
FEB 14 04:09:18 Portal2: 00060007: "Failed to send email alert for recent event."
FEB 14 04:10:46 Portal2: 00040002: "Host System has Reset"
FEB 14 04:10:46 Portal2: 00060007: "Failed to send email alert for recent event."
FEB 14 04:11:38 Portal2: 0004000b: "Host System has read and cleared bootmode."
FEB 14 04:11:39 Portal2: 00060007: "Failed to send email alert for recent event."
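About half of the entries in the log above are the same "Failed to send email alert" notice, which makes the host events harder to pick out. As a rough sketch only (this is not an ALOM feature; the function name filter_sc_log and the idea of saving the showlogs output to text first are my own assumptions), the noise could be filtered out in Python like this:

```python
import re

# Matches lines such as:
#   FEB 14 04:08:29 Portal2: 00040001: "SC Request to Power On Host."
LOG_LINE = re.compile(
    r'^(?P<stamp>[A-Z]{3} \d{1,2} \d{2}:\d{2}:\d{2}) '
    r'(?P<host>\S+): (?P<code>[0-9a-f]{8}): "(?P<msg>.*)"$'
)

# Message prefix treated as noise (taken from the log quoted in this post).
NOISE = "Failed to send email alert"


def filter_sc_log(text):
    """Return (stamp, code, msg) tuples for entries that are not email-alert noise."""
    events = []
    for line in text.splitlines():
        m = LOG_LINE.match(line.strip())
        if m is None:
            continue  # header, separator, or anything that is not a log entry
        if m.group("msg").startswith(NOISE):
            continue
        events.append((m.group("stamp"), m.group("code"), m.group("msg")))
    return events


sample = """\
FEB 14 04:08:29 Portal2: 00040001: "SC Request to Power On Host."
FEB 14 04:08:29 Portal2: 00060007: "Failed to send email alert for recent event."
FEB 14 04:08:30 Portal2: 00040002: "Host System has Reset"
FEB 14 04:08:30 Portal2: 0004000b: "Host System has read and cleared bootmode."
FEB 14 04:08:33 Portal2: 00060007: "Failed to send email alert for recent event."
"""

for stamp, code, msg in filter_sc_log(sample):
    print(stamp, code, msg)
```

Run on the sample, this leaves only the power-on request, the reset, and the bootmode entry.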
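After poweron the host booted into the operating system normally. /var/adm/messages contains only the log entries from this boot, and there is no dump under /var/crash/Portal either.
The only trace of when the system went down is in the output of last reboot: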
reboot system down SUN Aug 5 01:15
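Question 1: Why is there no record of the crash? Could the experts here help analyze this, or suggest a direction or method for troubleshooting?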
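The two checks described above (looking for dump files in the crash directory and for entries in /var/adm/messages) can be sketched in Python as follows. This is only an illustration under assumptions: it assumes savecore's usual file naming (unix.N and vmcore.N), the helper names find_dump_files and find_panic_lines are made up for this post, and the demo runs against a temporary directory rather than a real crash directory.

```python
import os
import re
import tempfile

# savecore normally writes pairs named unix.N and vmcore.N (assumed naming).
DUMP_FILE = re.compile(r"^(unix|vmcore)\.\d+$")


def find_dump_files(crash_dir):
    """Return the sorted names of crash dump files found in crash_dir."""
    if not os.path.isdir(crash_dir):
        return []
    return sorted(name for name in os.listdir(crash_dir) if DUMP_FILE.match(name))


def find_panic_lines(messages_text):
    """Return the lines of a messages file that mention a panic."""
    return [line for line in messages_text.splitlines() if "panic" in line.lower()]


# Demo on a throwaway directory so the sketch runs anywhere.
demo_dir = tempfile.mkdtemp()
print(find_dump_files(demo_dir))  # nothing has been saved yet

for name in ("bounds", "unix.0", "vmcore.0"):
    open(os.path.join(demo_dir, name), "w").close()
print(find_dump_files(demo_dir))  # only the unix.N / vmcore.N files are reported
```

On the real host, the directory passed in would be the savecore directory, and the text passed to find_panic_lines would be the contents of /var/adm/messages.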
I wanted to collect Explorer data, but the collection kept hanging at "disks running".
In the end I found that the raidctl output never terminates:
root@Portal2 # format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c1t0d0 <LSILOGIC-LogicalVolume-3000 cyl 65533 alt 2 hd 16 sec 136>
/pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1/sd@0,0
1. c1t2d0 <LSILOGIC-LogicalVolume-3000 cyl 65533 alt 2 hd 16 sec 136>
/pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1/sd@2,0
Specify disk (enter its number): ^C
root@Portal2 #
root@Portal2 # raidctl -l c1t0d0
Volume Size Stripe Status Cache RAID
Sub Size Level
Disk
----------------------------------------------------------------
c1t0d0 68.3G N/A OPTIMAL N/A RAID1
0.3.0 68.3G GOOD
0.2.0 68.3G GOOD
root@Portal2 # raidctl -l c1t2d0
Volume Size Stripe Status Cache RAID
Sub Size Level
Disk
----------------------------------------------------------------
c1t2d0 68.3G N/A OPTIMAL N/A RAID1
0.3.0 68.3G GOOD
0.2.0 68.3G GOOD
root@Portal2 #
root@Portal2 # raidctl -l
Controller: 1
Volume:c1t0d0
Volume:c1t2d0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
Disk: 0.2.0
^Croot@Portal2 #
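I waited 10 minutes here and it just kept printing the same line. I really can't make sense of it. What is going on?
Question 2: The system has 4 hard disks, configured as two hardware RAID 1 volumes. The raidctl output never ends, although by rights it should list the four physical disks under the volumes and stop. On top of that, raidctl -l c1t0d0 and raidctl -l c1t2d0 both show the same member disks (0.3.0 and 0.2.0). The more I look at it, the less I understand.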
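To make the oddity in Question 2 explicit, here is a small Python sketch that parses the two raidctl -l <volume> listings quoted above and reports any member disk that appears under more than one volume. The parser is a guess based only on the output shown in this post, not on any documented raidctl format, and the names parse_volume and shared_disks are hypothetical.

```python
import re
from collections import defaultdict

VOLUME = re.compile(r"^(c\d+t\d+d\d+)\s")   # e.g. "c1t0d0 68.3G N/A OPTIMAL N/A RAID1"
MEMBER = re.compile(r"^(\d+\.\d+\.\d+)\s")  # e.g. "0.3.0 68.3G GOOD"


def parse_volume(text):
    """Return (volume, [member disks]) from one 'raidctl -l <volume>' listing."""
    volume, members = None, []
    for line in text.splitlines():
        line = line.strip()
        v = VOLUME.match(line)
        d = MEMBER.match(line)
        if v:
            volume = v.group(1)
        elif d:
            members.append(d.group(1))
    return volume, members


def shared_disks(listings):
    """Map each disk that is listed under more than one volume to those volumes."""
    owners = defaultdict(list)
    for text in listings:
        volume, members = parse_volume(text)
        for disk in members:
            owners[disk].append(volume)
    return {disk: vols for disk, vols in owners.items() if len(vols) > 1}


# The two listings from this post (header and separator lines omitted).
vol_a = """\
c1t0d0 68.3G N/A OPTIMAL N/A RAID1
0.3.0 68.3G GOOD
0.2.0 68.3G GOOD
"""
vol_b = """\
c1t2d0 68.3G N/A OPTIMAL N/A RAID1
0.3.0 68.3G GOOD
0.2.0 68.3G GOOD
"""

print(shared_disks([vol_a, vol_b]))
```

With four physical disks in two mirrors, each volume would be expected to list its own pair and the result would be empty. Here both 0.3.0 and 0.2.0 are reported under c1t0d0 and c1t2d0.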
I would appreciate any guidance on these two questions!