Thread starter: Boson

How to get a software RAID running again after a failure

Post #31

Posted 2006-02-09 13:34

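Originally posted by platinum on 2006-2-9 12:22:
I ran an experiment.
Three disks (all healthy), second disk removed: the system boots.
Three disks (all healthy), first disk removed: VMware reports an error.
    With the second disk moved into the first disk's position: VMware runs, but the system fails to boot.
    ...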

Here is my RAID configuration file:
[root@appfs /]# cat /etc/raidtab
raiddev /dev/md0
raid-level             5
nr-raid-disks          3
nr-spare-disks       0
persistent-superblock 1
parity-algorithm       left-symmetric
chunk-size 32

device                /dev/hdb1
raid-disk             0
device                /dev/hdc1
raid-disk             1
device                /dev/hdd1
raid-disk             2

Here is my RAID status:
[root@appfs /]# cat /proc/mdstat
Personalities :
read_ahead not set
unused devices: <none>
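
A side note on the output above: the Personalities line normally lists the RAID levels the kernel has registered, for example [raid5]. An empty line suggests that no RAID personality is registered at this point, which may or may not be related to the failure (modules can be loaded on demand). A minimal sketch of that check, assuming the mdstat text is fed on stdin; `has_personality` is a hypothetical helper name, not part of raidtools:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the personality named in $1 appears on the
# "Personalities" line of the mdstat text read from stdin.
has_personality() {
  grep '^Personalities' | grep -q "\[$1\]"
}

# Demo on the output quoted above.
# On a live system this would be: has_personality raid5 < /proc/mdstat
if printf 'Personalities :\nread_ahead not set\nunused devices: <none>\n' | has_personality raid5
then
  echo "raid5 personality registered"
else
  echo "raid5 personality not registered"
fi
```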

Here is the output when I run raidstart:
[root@appfs /]# raidstart /dev/md0
/dev/md0: Invalid argument

Here is the error I get when I run raidhotadd:
[root@appfs /]# raidhotadd /dev/md0 /dev/hdb1
/dev/md0: can not hot-add disk: array not running!

So it looks like I have to get md0 running first; only then can I swap in the new disk.

Post #32

Posted 2006-02-09 13:39

Originally posted by Boson on 2006-2-9 13:34:
[root@appfs /]# raidstart /dev/md0
/dev/md0: Invalid argument

You surely haven't read the raidstart usage notes, have you?
The usage is wrong, so of course it reports an argument error.

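Post #33

Posted 2006-02-09 13:44

Originally posted by platinum on 2006-2-9 12:22:
I ran an experiment.
Three disks (all healthy), second disk removed: the system boots.
Three disks (all healthy), first disk removed: VMware reports an error.
    With the second disk moved into the first disk's position: VMware runs, but the system fails to boot.
    ...

RAID 5 is picky about disk order.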

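Post #34

Posted 2006-02-09 13:59

Originally posted by platinum on 2006-2-9 13:39:

You surely haven't read the raidstart usage notes, have you?
The usage is wrong, so of course it reports an argument error.

What is wrong with my raidstart /dev/md0? That is how I have always started the array.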

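Post #35

Posted 2006-02-09 14:01

Originally posted by bingosek on 2006-2-9 13:44:

RAID 5 is picky about disk order.

I have not changed the order. The disks are arranged exactly as they were before the failure; I only put the new disk in the position where the failed one was.

Or does the new disk have to go in the last position?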

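Post #36

Posted 2006-02-09 14:03

Originally posted by Boson on 2006-2-9 14:01:

I have not changed the order. The disks are arranged exactly as they were before the failure; I only put the new disk in the position where the failed one was.

Or does the new disk have to go in the last position?

Are the disk jumpers set the same way too?
Also, take a look at this: http://ostenfeld.dk/~jakob/Software-RAID.HOWTO/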

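Post #37

Posted 2006-02-09 14:07

Originally posted by platinum on 2006-2-9 14:03:

Are the disk jumpers set the same way too?
Also, take a look at this: http://ostenfeld.dk/~jakob/Software-RAID.HOWTO/

The jumpers on the failed disk and the new disk are set exactly the same.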

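Post #38

Posted 2006-02-09 14:11

Originally posted by Boson on 2006-2-9 14:07:

The jumpers on the failed disk and the new disk are set exactly the same.

Does the system detect all of the disks under the same device names as before?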

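Post #39

Posted 2006-02-09 14:18

Originally posted by bingosek on 2006-2-9 14:11:

Does the system detect all of the disks under the same device names as before?

Yes. The failed member was /dev/hdb1. After I swapped in the new disk it was detected as /dev/hdb, and partitioning it gave me /dev/hdb1.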

Post #40

Posted 2006-02-09 14:20

8.1 Recovery from a multiple disk failure
The scenario is:

A controller dies and takes two disks offline at the same time,
All disks on one scsi bus can no longer be reached if a disk dies,
A cable comes loose...
In short: quite often you get a temporary failure of several disks at once; afterwards the RAID superblocks are out of sync and you can no longer init your RAID array.

If using mdadm, you could first try to run:

mdadm --assemble --force

If not, there's one thing left: rewrite the RAID superblocks by mkraid --force

To get this to work, you'll need to have an up to date /etc/raidtab - if it doesn't EXACTLY match devices and ordering of the original disks this will not work as expected, but will most likely completely obliterate whatever data you used to have on your disks.

Look at the syslog produced by trying to start the array, you'll see the event count for each superblock; usually it's best to leave out the disk with the lowest event count, i.e. the oldest one.
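
The "leave out the lowest event count" rule can be sketched in a few lines of shell. This is only an illustration: `stalest_member` is a hypothetical helper, the "device count" input format is an assumption (the counts would have to be copied by hand from syslog or from a tool such as `mdadm --examine`), and the numbers below are made up rather than taken from this thread. Counts are assumed to be plain integers.

```shell
#!/bin/sh
# Hypothetical helper: reads "device event_count" pairs on stdin, one per line,
# and prints the device with the lowest event count (the stalest superblock).
stalest_member() {
  sort -k2,2n | head -n 1 | awk '{print $1}'
}

# Made-up example values, not taken from the thread.
printf '/dev/hdb1 17\n/dev/hdc1 42\n/dev/hdd1 42\n' | stalest_member
```

With these example values the member to leave out would be /dev/hdb1, since its superblock saw the fewest updates.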

If you mkraid without failed-disk, the recovery thread will kick in immediately and start rebuilding the parity blocks - not necessarily what you want at that moment.

With failed-disk you can specify exactly which disks you want to be active and perhaps try different combinations for best results. BTW, only mount the filesystem read-only while trying this out... This has been successfully used by at least two guys I've been in contact with.

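I keep thinking that rewriting the superblocks is one way to do this, but I don't know how to carry it out. Does everyone think this approach would work?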
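
As a rough sketch of what the failed-disk approach from the quoted HOWTO section might look like for the array in this thread: the script below only prints a candidate raidtab and does not run mkraid. The devices and parameters are copied from the raidtab posted in post #31; `make_raidtab` and its argument are hypothetical, and marking /dev/hdb1 as failed is an assumption based on the replaced disk being hdb. As the HOWTO warns, a raidtab that does not exactly match the original devices and ordering will most likely destroy the data, so the result should be checked by hand against the original file.

```shell
#!/bin/sh
# Hypothetical helper: prints a raidtab for the three-disk RAID-5 from post #31,
# with the device given in $1 declared as failed-disk instead of raid-disk.
make_raidtab() {
  failed="$1"
  printf '%s\n' \
    'raiddev /dev/md0' \
    'raid-level 5' \
    'nr-raid-disks 3' \
    'nr-spare-disks 0' \
    'persistent-superblock 1' \
    'parity-algorithm left-symmetric' \
    'chunk-size 32'
  slot=0
  # Device order must stay identical to the original raidtab.
  for dev in /dev/hdb1 /dev/hdc1 /dev/hdd1; do
    if [ "$dev" = "$failed" ]; then role=failed-disk; else role=raid-disk; fi
    printf 'device %s\n%s %s\n' "$dev" "$role" "$slot"
    slot=$((slot + 1))
  done
}

# Assumption: the freshly partitioned /dev/hdb1 is the member to leave out.
make_raidtab /dev/hdb1
```

Keeping the new disk out as failed-disk is meant to stop the recovery thread from starting immediately, so the filesystem can first be mounted read-only and inspected, as the HOWTO suggests.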
