shreychen posted on 2013-05-08 14:20

How to handle disk error logs detected by smartmontools

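Hi, I have found error logs on the disks of several servers, similar to the one below. I mostly can't make sense of them, and Google didn't turn up anything useful, so I'm asking the experts here:

1. What exactly do these error log entries mean? (I've read online that these logs are of little use: they only warn that the disk may have a problem, but they can't say so for certain.)
2. Can these errors be repaired? Are there any good tools for that? (The disks are currently monitored with nagios, and the constant alerts are annoying.)
3. Any other suggestions are welcome too.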

root@test:~# smartctl -l error /dev/sda
smartctl 5.40 2010-07-12 r3124 (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
ATA Error Count: 8 (device log contains only the most recent five errors)
      CR = Command Register
      FR = Features Register
      SC = Sector Count Register
      SN = Sector Number Register
      CL = Cylinder Low Register
      CH = Cylinder High Register
      DH = Device/Head Register
      DC = Device Command Register
      ER = Error register
      ST = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 8 occurred at disk power-on lifetime: 3400 hours (141 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
01 51 00 51 80 46 e1  Error: AMNF at LBA = 0x01468051 = 21397585

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
-- -- -- -- -- -- -- --  ----------------  --------------------
c8 00 00 80 7f 46 e1 08      00:32:18.452  READ DMA
ec 00 00 00 00 00 a0 08      00:32:18.436  IDENTIFY DEVICE
ef 03 46 00 00 00 a0 08      00:32:18.436  SET FEATURES

Error 7 occurred at disk power-on lifetime: 3400 hours (141 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
01 51 00 51 80 46 e1  Error: AMNF at LBA = 0x01468051 = 21397585

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
-- -- -- -- -- -- -- --  ----------------  --------------------
c8 00 00 80 7f 46 e1 08      00:32:16.469  READ DMA
c8 00 00 80 7a 46 e1 08      00:32:16.451  READ DMA
c8 00 e0 a0 79 46 e1 08      00:32:16.450  READ DMA

Error 6 occurred at disk power-on lifetime: 3216 hours (134 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 7b 1f 48 e1  Error: UNC at LBA = 0x01481f7b = 21503867

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
-- -- -- -- -- -- -- --  ----------------  --------------------
c8 00 00 d0 1e 48 e1 08      00:13:53.199  READ DMA
ec 00 00 00 00 00 a0 08      00:13:53.183  IDENTIFY DEVICE
ef 03 46 00 00 00 a0 08      00:13:53.183  SET FEATURES

Error 5 occurred at disk power-on lifetime: 3216 hours (134 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 7b 1f 48 e1  Error: UNC at LBA = 0x01481f7b = 21503867

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
-- -- -- -- -- -- -- --  ----------------  --------------------
c8 00 00 d0 1e 48 e1 08      00:13:51.465  READ DMA
ec 00 00 00 00 00 a0 08      00:13:51.449  IDENTIFY DEVICE
ef 03 46 00 00 00 a0 08      00:13:51.449  SET FEATURES

Error 4 occurred at disk power-on lifetime: 3216 hours (134 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 7b 1f 48 e1  Error: UNC at LBA = 0x01481f7b = 21503867

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
-- -- -- -- -- -- -- --  ----------------  --------------------
c8 00 00 d0 1e 48 e1 08      00:13:49.734  READ DMA
ec 00 00 00 00 00 a0 08      00:13:49.718  IDENTIFY DEVICE
ef 03 46 00 00 00 a0 08      00:13:49.718  SET FEATURES
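
For readers trying to interpret the register rows in the log above, here is a minimal sketch of how the "ER ST SC SN CL CH DH" line maps to the error name and LBA that smartctl prints next to it. It assumes 28-bit LBA addressing (which is what READ DMA, command c8, uses) and only covers the error-register flags that apply to read commands; the function names are made up for illustration and are not part of smartmontools.

```python
# Error-register (ER) flags relevant to ATA read commands.
# AMNF = address mark not found, UNC = uncorrectable data error.
ER_FLAGS = {
    0x01: "AMNF",
    0x04: "ABRT",
    0x10: "IDNF",
    0x40: "UNC",
    0x80: "ICRC",
}


def decode_lba28(sn, cl, ch, dh):
    """Combine the registers into a 28-bit LBA.

    The low nibble of DH holds bits 27..24, followed by CH, CL and SN.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def decode_error_flags(er):
    """Return the names of all flags set in the error register."""
    return [name for bit, name in sorted(ER_FLAGS.items()) if er & bit]


def parse_register_row(row):
    """Parse a hex row in the order ER ST SC SN CL CH DH (hypothetical helper)."""
    er, st, sc, sn, cl, ch, dh = (int(x, 16) for x in row.split()[:7])
    return {"error": decode_error_flags(er), "lba": decode_lba28(sn, cl, ch, dh)}


if __name__ == "__main__":
    for row in ("01 51 00 51 80 46 e1", "40 51 00 7b 1f 48 e1"):
        print(row, "->", parse_register_row(row))
```

Applied to the two distinct rows in this log, the sketch gives AMNF at LBA 21397585 and UNC at LBA 21503867, matching what smartctl reports. Both are read failures at a fixed sector, repeated on retry, which is consistent with (but does not by itself prove) unreadable sectors on the disk.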

shreychen posted on 2013-05-08 17:34

As a first step, I changed the monitoring script so that it only alerts when a large number of errors show up.
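
The actual script was not posted, so the following is only a sketch of what such a threshold check might look like. It parses the "ATA Error Count" line from the text printed by `smartctl -l error` and maps it to the usual Nagios plugin exit codes. The function name and the thresholds (10 and 50) are assumptions chosen for illustration, not values from this thread.

```python
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_ata_errors(smartctl_output, warn=10, crit=50):
    """Return (exit_code, message) for the text of `smartctl -l error`.

    warn and crit are hypothetical thresholds; tune them per environment.
    """
    match = re.search(r"^ATA Error Count:\s*(\d+)", smartctl_output, re.MULTILINE)
    if match is None:
        # smartctl prints "No Errors Logged" when the error log is empty.
        if "No Errors Logged" in smartctl_output:
            return OK, "OK - no ATA errors logged"
        return UNKNOWN, "UNKNOWN - could not find ATA Error Count"

    count = int(match.group(1))
    if count >= crit:
        return CRITICAL, "CRITICAL - ATA Error Count is %d" % count
    if count >= warn:
        return WARNING, "WARNING - ATA Error Count is %d" % count
    return OK, "OK - ATA Error Count is %d" % count
```

In a real plugin the output would come from something like `subprocess.run(["smartctl", "-l", "error", "/dev/sda"], capture_output=True, text=True).stdout`, and the script would print the message and exit with the returned code. Note that a raw threshold can hide a disk that is slowly accumulating unreadable sectors, so alerting on an increase in the count may be a safer design.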

midhu posted on 2013-05-21 14:25

Reply to 1# shreychen

It looks like the disk has bad sectors. Has the machine's performance not been affected?

shreychen posted on 2013-05-24 17:18

Reply to 3# midhu

The disk has been taken offline and repaired. I no longer use that script for the checks; I have switched to smartd.

lhhpassion posted on 2013-05-27 19:52

OP, how did you repair it? Please share. And what kind of error is this, exactly? I've run into it too, exactly the same one.

lhhpassion posted on 2013-05-29 23:07

No follow-up? Could someone please answer?

oyqiaojin posted on 2013-06-27 15:09

Same question here.