RAID1_HARD_DRIVE_Upgrade (last updated 20080516):
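If you have time, please support my small forum:
www.austchinese.com
Document update:
every step below has actually been carried out. A few days ago I finished the whole job in the data center in a little over 3 hours.
A few small changes were made to fit the real situation.
If you only want to build a software RAID1, you can start from step 12.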
Overview
---------------------------------
操作系统: Debian 4.0
CPU1: Intel(R) Pentium(R) 4 CPU 3.00GHz
Memory: 514252k/523712k available
---------------------------------
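The server currently has 2 hard drives in a software RAID1.
The goal is to replace them with 2 slightly larger drives, going from 80GB to 160GB.
The operating system and all data are copied to the new drives, and RAID1 is kept.
In the examples below, we assume the current RAID1 is on /dev/sda and /dev/sdb,
and the replacement drives are /dev/sdc and /dev/sdd.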
Scheme
------------------------------------------------------------------
Filesystem   Size/upgraded to   Used   Avail   Use%   Mounted on       Used for
/dev/md0     228M/228M          49M    194M    11%    /
tmpfs        10M/10M            0      10M     0%     /lib/init/rw
tmpfs        252M/252M          0      252M    0%     /dev/shm
/dev/md1 *   1.8G/3.7G          299M   1.4G    8%     /usr
/dev/md3 *   3.7G/12G           1.9G   1.7G    14%    /var             log files
/dev/md4     15G/30G            76M    14G     1%     /var/lib/mysql
/dev/md5 *   47G/137G           26G    20G     23%    /data            radius log, swap
tmpfs        252M/252M          0      252M    0%     /tmp
/dev/md2     906M/906M                                none
---------------------------------
Equipment needed:
160GB SCSI Hard Drive *2
---------------------------------
MAIN STEPS
------------------------------------------------------------------
1. Remove the sdb from RAID1 array.
2. Shutdown the box, physically remove the sdb drive.
3. Install a new drive(assuming the new drive is sdc, 160GB)
4. Start the box
5. Partition and format sdc
6. Copy the files from RAID device to sdc
7. Setup boot rules for sdc
8. Shutdown and power off the box
9. Remove sda
10. Start the box, boot from sdc
11. If boot success, poweroff box, install sdd, start box
12. Create and format new RAID1 devices with sdd
13. Copy files from sdc to RAID devices
14. Set up boot rules for RAID device
15. Boot from RAID, reformat sdc, add sdc to RAID1
16. Write MBR to sdc
17. Reboot
------------------------------------------------------------------
1. Remove the sdb from the array:
>> mdadm --set-faulty /dev/md0 /dev/sdb1
>> mdadm --remove /dev/md0 /dev/sdb1
>> mdadm --set-faulty /dev/md1 /dev/sdb2
>> mdadm --remove /dev/md1 /dev/sdb2
>> mdadm --set-faulty /dev/md2 /dev/sdb3
>> mdadm --remove /dev/md2 /dev/sdb3
>> mdadm --set-faulty /dev/md3 /dev/sdb5
>> mdadm --remove /dev/md3 /dev/sdb5
>> mdadm --set-faulty /dev/md4 /dev/sdb6
>> mdadm --remove /dev/md4 /dev/sdb6
>> mdadm --set-faulty /dev/md5 /dev/sdb7
>> mdadm --remove /dev/md5 /dev/sdb7
>> mdadm --set-faulty /dev/md6 /dev/sdb8
>> mdadm --remove /dev/md6 /dev/sdb8
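The fourteen commands above all follow one pattern, so they could also be generated by a loop. What follows is only a sketch: the names `run`, `detach_member`, `detach_all` and the `DRY_RUN` variable are my own illustration and not part of the original procedure; the array/partition pairs are copied from the commands above. With `DRY_RUN=1` (the default here) it only prints the mdadm commands, so they can be reviewed first.

```shell
#!/bin/sh
# Sketch: fail and remove every sdb member listed in step 1.
# DRY_RUN, run, detach_member and detach_all are illustrative names.
DRY_RUN=${DRY_RUN:-1}

# array:partition pairs, copied from the commands in step 1
PAIRS="md0:sdb1 md1:sdb2 md2:sdb3 md3:sdb5 md4:sdb6 md5:sdb7 md6:sdb8"

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

detach_member() {
    # $1 = md device name, $2 = member partition name
    run mdadm --set-faulty "/dev/$1" "/dev/$2" &&
    run mdadm --remove "/dev/$1" "/dev/$2"
}

detach_all() {
    for pair in $PAIRS; do
        detach_member "${pair%%:*}" "${pair##*:}" || return 1
    done
}

detach_all
```

Setting `DRY_RUN=0` would run the commands for real; the loop stops at the first command that fails.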
----------------
2. Shutdown the box, physically remove the sdb drive.
----------------
3. Install a new drive(assuming the new drive is sdc, 160GB)
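In the real operation the new drive goes into the port that sdb used, so the system will see it as /dev/sdb.
The rest of this document still calls it sdc,
so that it is clear which disk is being configured.
In the real operation you should use sdb.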
----------------
4. Start the box
----------------
5. partition and format sdc
# partition as below:
# Set up the partition ID signature on the new disk /dev/sdc:
>> fdisk /dev/sdc
Device      Boot   System     Size     Mounted on       FS type
/dev/sdc1   *      PRIMARY    228M     /                83
/dev/sdc2          PRIMARY    3.7G     /usr             83
/dev/sdc3          Extended   179.9G                    5
/dev/sdc5          LOGICAL    906M     none (swap)      82
/dev/sdc6          LOGICAL    12G      /var             83
/dev/sdc7          LOGICAL    5G       /var/lib/mysql   83
/dev/sdc8          LOGICAL    137G     /data            83
# check point
>> cfdisk /dev/sdc
cfdisk is basically fdisk with a menu-driven interface. Use it for a rough check of the layout,
to confirm nothing went wrong in the fdisk step.
# then format the new drive
>> mkfs.ext3 /dev/sdc1
>> mkfs.ext3 /dev/sdc2
# do not run mkfs on /dev/sdc3: it is the extended partition, which only holds the logical partitions
//>> mkswap /dev/sdc5 # skipped here; swap is expected to be picked up
//>> swapon -a        # from /newdisk/etc/fstab below (see step 17 if swap shows 0)
>> mkfs.ext3 /dev/sdc6
>> mkfs.ext3 /dev/sdc7
>> mkfs.ext3 /dev/sdc8
# mount the new drive
mkdir /newdisk
mount -t ext3 /dev/sdc1 /newdisk
mkdir /newdisk/usr
mount -t ext3 /dev/sdc2 /newdisk/usr
mkdir /newdisk/var
mount -t ext3 /dev/sdc6 /newdisk/var
mkdir /newdisk/var/lib
mkdir /newdisk/var/lib/mysql
mount -t ext3 /dev/sdc7 /newdisk/var/lib/mysql
mkdir /newdisk/data
mount -t ext3 /dev/sdc8 /newdisk/data
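Do not mkdir all the directories in one go and then mount everything together: a directory created before its parent filesystem is mounted ends up hidden underneath that mount.
For the details, read a basic introduction to how mount works.
The order above can be changed, but make sure you know which changes are harmless.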
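One way to keep the order safe is to sort the mount points so that every parent comes before its children, and to create each directory only right before mounting it. The sketch below only prints the commands. `plan_mounts` and `MOUNTS` are names I made up for illustration; the device/mount-point pairs are copied from the commands above.

```shell
#!/bin/sh
# Sketch: print mkdir/mount commands parent-first, so that each mount point
# is created on the filesystem that is already mounted above it.
# plan_mounts and MOUNTS are illustrative names; the pairs come from step 5.
MOUNTS="/dev/sdc8:/data
/dev/sdc7:/var/lib/mysql
/dev/sdc1:/
/dev/sdc6:/var
/dev/sdc2:/usr"

plan_mounts() {
    # $1 = directory where the new disk is assembled (e.g. /newdisk)
    target=$1
    # A parent path is a prefix of its children, so a plain byte-order sort
    # on the mount point puts "/" first and "/var" before "/var/lib/mysql".
    printf '%s\n' "$MOUNTS" | LC_ALL=C sort -t: -k2 | while IFS=: read -r dev mp; do
        dir="$target${mp%/}"
        echo "mkdir -p $dir"
        echo "mount -t ext3 $dev $dir"
    done
}

plan_mounts /newdisk
```

Because `/newdisk/var` is mounted before `mkdir -p /newdisk/var/lib/mysql` is printed, that directory would be created on the new /var partition rather than hidden under it.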
----------------
6. Copy the files from RAID device to sdc
# shut down the system daemons, prevent login:
/sbin/telinit 1
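# copy the files from sda to sdc (updated as below)
# Start from the root directory and copy all files and directories:
>>rsync -avHW --progress --exclude '/newdisk/' --include 'proc/' --exclude '/proc/*' \
--include 'sys/' --exclude '/sys/*' / /newdisk
A word about rsync: we do not copy /sys/* and /proc/* because these two directories are virtual filesystems
whose files are generated on the fly after the system boots. We only copy the two empty directories /sys and /proc to the new disk.
And of course we do not copy /newdisk itself, hence the exclude for /newdisk.
Check the command before pressing Enter: it usually takes a long time, and this is not the place for a mistake.
Now you can enjoy the coffee you prepared. 30GB of data takes about 30 minutes.
For more on rsync, see man rsync; depending on your situation there may be many things you do not want to copy.
# Check point: compare the files; make sure that for /sys and /proc only the directories themselves were copied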
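The check point can be done by hand, but a rough comparison of path names is easy to script. This is a sketch, not what I ran at the time: `missing_in_copy` is a made-up helper, and it compares names only, not file contents.

```shell
#!/bin/sh
# Sketch: list paths that exist under the source tree but not under the copy.
# missing_in_copy is an illustrative name; it compares path names only.
missing_in_copy() {
    # $1 = source root, $2 = copy root
    src_list=$(mktemp) || return 1
    dst_list=$(mktemp) || return 1
    # Record paths relative to each root; -xdev keeps find on one filesystem.
    (cd "$1" && find . -xdev | LC_ALL=C sort) > "$src_list"
    (cd "$2" && find . -xdev | LC_ALL=C sort) > "$dst_list"
    # comm -23 keeps lines that appear only in the first file.
    LC_ALL=C comm -23 "$src_list" "$dst_list"
    rm -f "$src_list" "$dst_list"
}
```

Since `-xdev` stops at filesystem boundaries, it would be called once per mounted filesystem, for example `missing_in_copy / /newdisk` and then `missing_in_copy /usr /newdisk/usr`. For the first call, `./newdisk` is expected in the output because rsync excluded it.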
----------------
7. Setup boot rules for sdc
# Configure the new /etc/fstab and LILO
#7.1 Configure the /newdisk/etc/fstab
# <file system> <mount point> <type> <options> <dump> <pass>
/dev/sda1 / ext3 defaults,errors=remount-ro 0 1
/dev/sda2 /usr ext3 defaults 0 2
#/dev/sda3 Extended
/dev/sda5 none swap sw 0 0
/dev/sda6 /var ext3 defaults 0 2
/dev/sda7 /var/lib/mysql ext3 defaults,noatime,nodiratime 0 2
/dev/sda8 /data ext3 defaults 0 2
proc /proc proc defaults 0 0
tmpfs /tmp tmpfs rw 0 0
/dev/hda /media/cdrom0 iso9660 ro,user,noauto 0 0
/dev/fd0 /media/floppy0 auto rw,user,noauto 0 0
sda is used here because during the boot that follows, sdc will be seen as sda.
#7.2 Install LILO to the new disk (sdc)
# Edit /newdisk/etc/lilo.conf
# find the following lines and change them to the one below
# boot=/dev/md0
boot=/dev/sdc
This sets which disk's MBR we are writing to.
#root=/dev/md1
root=/dev/sda1
This sets /dev/sda1 to be mounted as /.
#install=menu
install=/newdisk/boot/boot-menu.b
#map=/boot/map
map=/newdisk/boot/map
#image=/boot/vmlinuz
image=/newdisk/boot/vmlinuz
#image=/boot/vmlinuz.old
image=/newdisk/boot/vmlinuz.old
#image=/boot/vmlinuz.ancient
image=/newdisk/boot/vmlinuz.ancient
#image=/boot/vmlinuz.stable
image=/newdisk/boot/vmlinuz.stable
#image=/boot/vmlinuz.test
image=/newdisk/boot/vmlinuz.test
#Install LILO on the new disk
>> /sbin/lilo -C /newdisk/etc/lilo.conf
There will be warning messages such as: sdc is not the first disk,
image=/newdisk/boot/vmlinuz written/loaded, etc.
These are normal; ignore them.
Once more: in the real operation, replace sdc with sdb (the matching disk).
----------------
8. Shutdown and power off the box
----------------
9. Physically remove sda
----------------
10. Start the box, boot from sdc
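Note: the new drive (sdc) has to go into the slot sda used to occupy, so that the server physically sees it as sda.
At this point the box boots from the new drive.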
----------------
11. If success, poweroff box, install sdd, start box
>>fdisk -l /dev/sda   # check the partition table
Device      Boot   Start   End     Blocks       Id   System
/dev/sda1   *      1       34      273073+      83   Linux
/dev/sda2          35      30401   243922927+   5    Extended
/dev/sda5          35      642     4883728+     83   Linux
/dev/sda6          643     1007    2931831      83   Linux
/dev/sda7          1008    1337    2650693+     82   Linux swap / Solaris
/dev/sda8          1338    1386    393561       83   Linux
/dev/sda9          1387    30401   233062956    83   Linux
----------------
12. Create and format new RAID1 devices with sdd
In practice your sdd will be seen as /dev/sdb.
# copy partition schema from sdc to sdd
>>sfdisk -d /dev/sdc | sfdisk /dev/sdd
>>cfdisk /dev/sdd
# modify FS Id of sdd to 'fd', Write and quit
# reboot
# Create and format new RAID1 devices
>> mdadm --create /dev/md0 --level=1 --raid-disks=2 /dev/sdd1 missing
Be careful with this step: do not get the order of "/dev/sdd1 missing" wrong.
I used to write "missing /dev/sdd1", which caused an error with LILO; this is covered below.
>> mkfs.ext3 /dev/md0
...
...
>> mdadm --create /dev/md5 --level=1 --raid-disks=2 /dev/sdd8 missing
>> mkfs.ext3 /dev/md5
# Check RAID stat
>>cat /proc/mdstat
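The create commands differ only in the md number and the partition number, so they can be generated. The sketch below only prints them. Note the assumption: the md-to-partition pairing is not spelled out in this step, so I take it from the mdadm --add list in step 15 (md0:1, md1:2, md2:5, md3:6, md4:7, md5:8). `create_degraded` is a made-up name.

```shell
#!/bin/sh
# Sketch: print the commands that create each RAID1 array in degraded mode.
# ASSUMPTION: the md:partition pairing is taken from the --add list in step 15.
# create_degraded and PAIRS are illustrative names, not from the original post.
PAIRS="md0:1 md1:2 md2:5 md3:6 md4:7 md5:8"

create_degraded() {
    # $1 = disk that holds the first half of each mirror (e.g. /dev/sdd)
    for pair in $PAIRS; do
        md=${pair%%:*}
        part=${pair##*:}
        # The real partition goes first and "missing" second.
        echo "mdadm --create /dev/$md --level=1 --raid-disks=2 $1$part missing"
    done
}

create_degraded /dev/sdd
```

Formatting is left out on purpose: md2 is the swap device (mkswap in step 13), while the others get mkfs.ext3.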
----------------
13. Copy files from sdc to RAID devices
Same as before:
>> mkdir /mnt
>> mount /dev/md0 /mnt
>> mkdir /mnt/usr
>> mount /dev/md1 /mnt/usr
>> mkdir /mnt/var
>> mount /dev/md3 /mnt/var
>> mkdir /mnt/var/lib
>> mkdir /mnt/var/lib/mysql
>> mount /dev/md4 /mnt/var/lib/mysql
>> mkdir /mnt/data
>> mount /dev/md5 /mnt/data
>> mkswap /dev/md2
>> swapon -a
# Start to copy files
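>>rsync -avHW --progress --exclude '/mnt/' --include 'sys/' \
--exclude '/sys/*' --include 'proc/' --exclude '/proc/*' / /mnt
Check the command before pressing Enter. Then it is coffee time again; you can use the time to pack up your old drives.
If nothing went wrong here, you basically will not need them again.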
----------------
14. Setup boot rules for RAID device
# 14.1 change the /mnt/etc/fstab and LILO to boot from md devices
# /mnt/etc/fstab
# <file system> <mount point> <type> <options> <dump> <pass>
/dev/md0 / ext3 defaults,errors=remount-ro 0 1
/dev/md1 /usr ext3 defaults 0 2
/dev/md3 /var ext3 defaults 0 2
/dev/md4 /var/lib/mysql ext3 defaults,noatime,nodiratime 0 2
/dev/md5 /data ext3 defaults 0 2
/dev/md2 none swap sw 0 0
proc /proc proc defaults 0 0
tmpfs /tmp tmpfs rw 0 0
/dev/hda /media/cdrom0 iso9660 ro,user,noauto 0 0
/dev/fd0 /media/floppy0 auto rw,user,noauto 0 0
# 14.2 Edit /mnt/etc/lilo.conf
# find the following lines and change them to the one below
# boot=/dev/sdc1
# boot=/dev/md0
boot=/dev/sdd
In practice you may need to use sdb here.
#root=/dev/sdc2
root=/dev/md0
#install=/newdisk/boot/boot-menu.b
install=/mnt/boot/boot-menu.b
#map=/newdisk/boot/map
map=/mnt/boot/map
#image=/newdisk/boot/vmlinuz
image=/mnt/boot/vmlinuz
#image=/newdisk/boot/vmlinuz.old
image=/mnt/boot/vmlinuz.old
#image=/newdisk/boot/vmlinuz.ancient
image=/mnt/boot/vmlinuz.ancient
#image=/newdisk/boot/vmlinuz.stable
image=/mnt/boot/vmlinuz.stable
#image=/newdisk/boot/vmlinuz.test
image=/mnt/boot/vmlinuz.test
#Install LILO on the RAID device
>> /sbin/lilo -C /mnt/etc/lilo.conf
If you run into this here:
lilo Fatal: Trying to map files from unnamed device 0x0000 (NFS/RAID mirror down ?)
go back and check the device order you used when creating the RAID1 arrays.
(Detailed reason:
This may mean two things:
* RAID is being rebuilt - check it with cat /proc/mdstat,
and try again when it's finished.
* Another is that the first device in the RAID array doesn't exist,
such as when building a degraded array with only one device.
If you stop the array and reassemble it so that the active device is first,
lilo should start working again.
)
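A quick way to look for the second cause is to scan /proc/mdstat for arrays whose first slot is empty. This sketch rests on an assumption I have not verified on every kernel: that the `[U_]` / `[_U]` string lists the member slots in order, so a leading `_` means slot 0 is missing. `first_slot_missing` is a made-up name.

```shell
#!/bin/sh
# Sketch: list arrays whose first member slot is empty.
# ASSUMPTION: the [U_] / [_U] string in /proc/mdstat lists member slots in
# order, so a leading "_" means slot 0 is missing.
# first_slot_missing is an illustrative name, not from the original post.
first_slot_missing() {
    # $1 = mdstat file (defaults to /proc/mdstat)
    awk '
        /^md[0-9]+ :/ { name = $1 }                 # remember the array name
        /blocks/ && $NF ~ /^\[_/ { print name }     # status string starts with "_"
    ' "${1:-/proc/mdstat}"
}
```

Running `first_slot_missing` with no argument reads the live /proc/mdstat; any array it prints is a candidate for the LILO error above.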
# reboot (boot from RAID device)
----------------
15. boot from RAID, if success, reformat sdc, add sdc to RAID1
>>cfdisk /dev/sdc
# modify file system type to 'fd', write and quit,
# reboot to load the change above
>> mdadm --add /dev/md0 /dev/sdc1
>> mdadm --add /dev/md1 /dev/sdc2
>> mdadm --add /dev/md2 /dev/sdc5
>> mdadm --add /dev/md3 /dev/sdc6
>> mdadm --add /dev/md4 /dev/sdc7
>> mdadm --add /dev/md5 /dev/sdc8
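Be careful before pressing Enter. The sync speed is about 50MB/s, so 160GB takes roughly 54 minutes.
Listen to some music or have something to eat. By this point the big problems are basically behind you.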
# monitor /proc/mdstat until everything is synched
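Instead of re-running cat by hand, a small loop can wait until no array reports a rebuild. This is a sketch under the assumption that a rebuild in progress shows up in /proc/mdstat as a line containing "recovery" or "resync"; `sync_in_progress`, `wait_for_sync` and `MDSTAT` are made-up names.

```shell
#!/bin/sh
# Sketch: block until no array reports a resync or recovery in progress.
# sync_in_progress, wait_for_sync and MDSTAT are illustrative names; MDSTAT
# can be pointed at a saved copy of /proc/mdstat for testing.
MDSTAT=${MDSTAT:-/proc/mdstat}

sync_in_progress() {
    # Rebuild progress lines in mdstat contain "recovery" or "resync".
    grep -Eq 'recovery|resync' "$MDSTAT"
}

wait_for_sync() {
    # $1 = polling interval in seconds (default 60)
    while sync_in_progress; do
        grep -E 'recovery|resync' "$MDSTAT"   # show current progress
        sleep "${1:-60}"
    done
    echo "all arrays in sync"
}
```

Calling `wait_for_sync 30` would poll every 30 seconds and return once the progress lines are gone.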
----------------
16. Write MBR to sdc
>>dd if=/dev/sdd of=/tmp/mbr_raid1_backup bs=512 count=1 #backup MBR
>>dd if=/tmp/mbr_raid1_backup of=/dev/sdc bs=512 count=1 #write MBR to sdc
----------------
17. Reboot
>>/usr/bin/free
check total Mem, Swap
if swap is showing 0,
>> mkswap /dev/md2
>> swapon -a
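Check that all the services that ran before are still running, and make a few phone calls to confirm the server can be reached and used from outside.
Check everything more than once before you leave.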
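The swap check can also be scripted by reading SwapTotal from /proc/meminfo, which should agree with what free reports. A sketch only: `swap_total_kb`, `check_swap` and `MEMINFO` are made-up names, and the script just prints a hint rather than running mkswap itself.

```shell
#!/bin/sh
# Sketch: report whether any swap is active, reading SwapTotal from meminfo.
# swap_total_kb, check_swap and MEMINFO are illustrative names; MEMINFO can
# point at a test file instead of /proc/meminfo.
MEMINFO=${MEMINFO:-/proc/meminfo}

swap_total_kb() {
    # Second field of the SwapTotal line is the size in kB.
    awk '/^SwapTotal:/ {print $2}' "$MEMINFO"
}

check_swap() {
    total=$(swap_total_kb)
    if [ "${total:-0}" -gt 0 ]; then
        echo "swap active: ${total} kB"
    else
        echo "swap is 0: run mkswap /dev/md2 and swapon -a"
    fi
}
```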
----------------
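A few closing words.
To be honest, I have only just started with Unix.
For this whole procedure I searched the web for a long time, tested, failed, and tested again.
If anyone needs to do something similar, I hope this helps.
I strongly recommend reading up on the boot process and understanding keywords such as MBR and LILO.
I planned this upgrade for a long time and bought my own equipment to test on. In the end it worked on the first try in the data center.
I don't think I'm cut out to be an administrator; I'm not serious enough, hehe.
Here I send my regards to the mother of the boss who assigned this task,
because you and I both know there are many simpler and faster ways to do this upgrade,
and you never gave me any technical or equipment support.
I will never forget 14/05: you gave me a lousy access card and a string of passwords,
and threw me, a newcomer who had never touched any of this, into the server room.
Sorry I didn't give you a laugh at my expense.
I'm not an angry youth, but I still want to say: don't underestimate Chinese people.
If we set out to do something, we get it done.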