Tutorial on DiskSuite
Solstice DiskSuite 4.2.1
General Info:
- Requires Solaris 8 to run. Also ships with Solaris 8; located on S/W CD 2 at:
/cdrom/cdrom0/Solaris_8/EA/products/DiskSuite_4.2.1
- Incompatible with previous versions of DiskSuite. If upgrading => have to reconfigure.
- DiskSuite / Solaris compatibility matrix:
- Solaris 2.5.1 -> DS 4.0, 4.1
- Solaris 2.6 -> DS 4.1, 4.2
- Solaris 7 -> DS 4.2
- Solaris 8 -> DS 4.2.1
Installation:
1. Mount Solaris 8 S/W CD2 (if remote: put "share -F nfs -o ro -d "CDROM" /cdrom/cdrom0" in /etc/dfs/dfstab; shareall; then mount from the remote location)
2. cd /cdrom/cdrom0/Solaris_8/EA/products/DiskSuite_4.2.1/i386/Packages
3. pkgadd -d .
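- a quick way to confirm the packages installed (assuming the usual SUNWmd* package names, e.g. SUNWmdr, SUNWmdu, SUNWmdg):
pkginfo | grep SUNWmd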
- Administer DiskSuite either via the DiskSuite Tool GUI (metatool) or the command line.
Terms:
- metadevice (simple, mirror, RAID5, trans) - a group of physical devices that appears as a single logical device.
- metadevice state database - stores the configuration and state of all metadevices; it is kept in multiple replicas, each on a dedicated slice or on a slice that will be used in a metadevice.
- hot spare pool - collection of hot spares waiting to be used in case of a failure.
Config files:
- /etc/lvm/mddb.cf - state database replicas
- /etc/lvm/md.tab - input file for "metainit", "metadb", and "metahs"
- /etc/lvm/md.cf - a backup of the current configuration - useful when recovering from a crash.
- /kernel/drv/md.conf - number of metadevices and disksets
- /etc/lvm/mdlogd.cf - configuration of mdlogd, the SNMP trap generating daemon
- /etc/rcS.d/S35lvm.init - metadevice configuration at boot
- /etc/rc2.d/S95lvm.sync - automatic resyncing of metadevices
/etc/lvm/md.tab examples (commands to activate these entries follow the examples):
- 3 state database replicas on each slice:
mddb01 -c 3 c0t1d0s0 c0t2d0s0 c0t3d0s0
- single stripe consisting of 2 disks, with 32k interlace value (default is 16k):
d15 1 2 c0t1d0s2 c0t2d0s2 -i 32k
- concatenation of 4 stripes, each consisting of 1 disk:
d7 4 1 c0t1d0s0 1 c0t2d0s0 1 c0t3d0s0 1 c0t4d0s0
- concatenation of 2 stripes, each made of 3 disks:
d75 2 3 c0t1d0s2 c0t2d0s2 c0t3d0s2 -i 16k 3 c1t1d0s2 c1t2d0s2 c1t3d0s2 -i 32k
- mirror (d50) containing 1 submirror (d51). The other 2 submirrors (d52, d53) must be attached later with "metattach" (see the commands after this example):
d50 -m d51
d51 1 1 c0t1d0s2
d52 1 1 c0t2d0s2
d53 1 1 c0t3d0s2
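- the remaining submirrors from the example above would then be attached with:
metattach d50 d52
metattach d50 d53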
- trans metadevice d1, containing a master device d10 and a logging device d20 (both mirrors).
Submirrors d12 and d22 are attached later via "metattach" (see the commands after this example):
d1 -t d10 d20
d10 -m d11
d11 1 1 c0t1d0s2
d12 1 1 c0t2d0s2
d20 -m d21
d21 1 1 c1t1d0s2
d22 1 1 c1t2d0s2
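- the second submirrors from the example above would then be attached with:
metattach d10 d12
metattach d20 d22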
- RAID5 metadevice of 3 slices, using an interlace of 20K:
d80 -r c0t1d0s1 c1t0d0s1 c2t0d0s1 -i 20k
- hot spare pools -- 3 hot spare pools are associated with 3 submirrors:
d10 -m d20
d20 1 1 c1t0d0s2 -h hsp001
d30 1 1 c2t0d0s2 -h hsp002
d40 1 1 c3t0d0s2 -h hsp003
hsp001 c2t2d0s2 c3t2d0s2 c1t2d0s2
hsp002 c3t2d0s2 c1t2d0s2 c2t2d0s2
hsp003 c1t2d0s2 c2t2d0s2 c3t2d0s2
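Activating md.tab entries:
- a minimal sketch, assuming metainit and metadb on this release read /etc/lvm/md.tab and accept the names defined above:
metainit -n -a (check the syntax of all md.tab entries without creating anything)
metainit -a (create all metadevices defined in md.tab; a single one can be created with e.g. "metainit d15")
metadb -a -f mddb01 (create the state database replicas defined by the mddb01 line)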
[color="#ff4400"]The following general section will also be featured as a separate RAID topic, as it contains valuable advice when selecting a particular RAID level.
General configuration notes:
1. Striping has the best performance, but offers no data protection.
2. For write intensive applications, mirroring has better performance than RAID5.
3. Mirroring and RAID5 both increase data availability, and both decrease writing performance.
4. Mirroring improves random read performance.
5. RAID5 has lower cost than mirroring. Stripes/concatenations have no additional cost.
Concatenation notes:
1. Concatenation uses less CPU time than striping.
2. Concatenation works well for small random I/O.
3. Avoid using physical disks with different geometries.
4. Distribute slices across different controllers and busses to help balance the I/O load.
Striping (RAID0) notes:
1. Set the stripe's interlace value correctly.
2. The more physical disks in a stripe, the greater the I/O performance and the lower the MTBF (mean time between failures).
3. Don't mix differently sized slices, as a stripe's size is limited by its smallest slice.
4. Avoid using physical disks with different geometries.
5. Distribute the striped metadevice across different controllers and busses.
6. Striping cannot be used to encapsulate existing filesystems.
7. Striping performs well for large sequential I/O and for random I/O distributions.
8. Striping uses more CPU cycles than concatenation, but it is usually worth it.
9. Striping does not provide any redundancy of data.
Mirroring (RAID1) notes:
1. Mirroring may improve read performance; write performance is always degraded.
2. Mirroring improves read performance only in multi-threaded or asynchronous I/O situations.
3. Mirroring degrades write performance by about 15-50 percent, as it has to write everything twice.
4. Using the filesystem cache may turn an 80/20 read/write mix into 60/40 or even 40/60.
RAID5 notes:
1. RAID5 can withstand only a single device failure (mirroring MAY withstand several; striping and concatenation leave no room for that).
2. RAID5 provides good read performance under no errors, and poor read performance under error conditions.
3. RAID5 can cause poor write performance -- up to 70 percent degradation (as parity has to be calculated on the fly).
4. RAID5 is much cheaper than mirroring. The fraction of capacity consumed by parity = 1 / total number of disks (see the example after this list).
5. RAID5 can NOT be used for existing filesystems. A backup and restore will be necessary.
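- example: a RAID5 metadevice built from 5 slices loses 1/5 = 20% of its raw capacity to parity (the equivalent of 4 slices stays usable); a mirror with the same usable capacity would need 8 slices.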
Logging device notes:
1. Place them on an unused disk, preferably around the middle of it (to minimize the average seek).
2. The log device and the master device of the same trans metadevice should be located on different drives/controllers to balance the I/O load.
3. Trans metadevices can share logs. This is not recommended for heavily used filesystems.
4. Absolute minimum log size is 1 MB. A good average is 1 MB of log per 100 MB of filesystem; the recommended minimum is 1 MB of log per 1 GB (see the example after this list).
5. All logs should be mirrored to avoid filesystem problems and/or data loss.
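- example: by these rules a 2 GB filesystem needs at least a 2 MB log (1 MB per 1 GB), while the "1 MB per 100 MB" average works out to roughly 20 MB; in either case the log should be mirrored (note 5).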
Filesystem notes:
1. Create new filesystems with "newfs -i 8192" -- 1 inode per 8K (default is 1 inode per 2K).
2. For large metadevices (>8 GB), increase the size of a cylinder group (max is 256) -- "newfs -c 256".
3. The filesystem cluster size (maxcontig * block size) should be equal to an integral multiple of the stripe width, for example:
maxcontig = 16 (16 * 8 Kbyte blocks = 128 Kbyte clusters)
interlace size = 32K (32K stripe unit size * 4 disks = 128K stripe width)
A four-way stripe with a 32K interlace gives a 128K stripe width, which matches the 128K cluster size (see the newfs example after these notes).
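- a possible newfs invocation combining these notes for a large striped metadevice such as d75 (the -C option for maxcontig is an assumption here -- check newfs(1M); maxcontig can also be changed later with "tunefs -a"):
newfs -i 8192 -c 256 -C 16 /dev/md/rdsk/d75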
[color="#ff4400"]General section ends here.
State database replica notes:
1. All replicas are written when the configuration changes.
2. Only two replicas (per mirror) are updated for mirror dirty region bitmaps.
3. A good average is two replicas per three mirrors.
4. Use two replicas per one mirror for write intensive applications.
5. Use two replicas per 10 mirrors for read intensive applications.
6. 1 drive => 3 replicas on one slice, as the minimum number of replicas is 3.
7. 2-4 drives => 2 replicas on each drive.
8. 5+ drives => 1 replica on each drive (see the metadb examples after this list).
9. Each state database replica occupies 517K (1034 disk sectors).
10. Replicas can be stored on a dedicated slice or on one that will be used in a metadevice.
11. The system keeps running as long as half of the replicas are available, but needs half + 1 (a majority) to reboot.
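- example metadb commands matching the layout rules above (slice names are placeholders; -f is only needed when creating the very first replicas):
metadb -a -f -c 3 c0t0d0s7 (single drive: 3 replicas on one slice)
metadb -a -f -c 2 c0t0d0s7 c0t1d0s7 (2-4 drives: 2 replicas per drive)
metadb -a -f c0t0d0s7 c0t1d0s7 c0t2d0s7 c0t3d0s7 c0t4d0s7 (5+ drives: 1 replica per drive)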
CREATING DISKSUITE OBJECTS:
Create additional state database replicas:
- adding a state database replica:
metadb -a c0t2d0s0
Make the root filesystem use a one-way mirror d10 (built from submirror d11):
metainit d10 -m d11
metaroot d10
Checking the status of a metadevice:
- specified (a simple metadevice gives no report unless it is used as a submirror):
metastat d11
- everything else:
metastat (for a hot spare pool, it is "metastat hsp001")
Checking the status of a diskset:
- specified:
metaset -s relo-red
- all:
metaset
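Dumping the current configuration:
- print all metadevices in md.tab format and list the state database replicas with their status flags; useful to save before making changes:
metastat -p
metadb -i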
Recreate a stripe/concatenation after slice failure:
- concatenation has a failed slice => error on console: "WARNING: md d35: read error on /dev/dsk/c0t0d0s6":
umount /news
init 0
...
boot -s
...
ufsdump 0ucf /dev/rmt/0 /news
metaclear d35
metainit d35 2 1 c1t0d0s2 1 c1t0d1s2
newfs /dev/md/rdsk/d35
mount /dev/md/dsk/d35 /news
cd /news
ufsrestore rvf /dev/rmt/0
rm restoresymtable
ls /news
Note: if it is a stripe, the new slice must be the same size as the failed slice; if a concatenation, the new slice must be at least as large as the failed slice.
Enable a slice in a submirror:
- mirror d11 contains a slice c1t4d0s7 which had an error but is now ready to be enabled again:
metareplace -e d11 c1t4d0s7
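- the resync of the re-enabled slice can then be watched with:
metastat d11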
Replace a slice in a submirror:
- mirror d6 has a submirror d26, with a slice c0t2d0s2 in the "Needs maintenance" state (see the sketch after this entry):
metastat d6
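- a sketch of the actual replacement, assuming a spare slice is available (c0t4d0s2 here is only a placeholder):
metareplace d6 c0t2d0s2 c0t4d0s2
metastat d6 (repeat until the resync completes)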
*** Read "Tips and Tricks" from the Solstice DiskSuite 4.2.1 Collection
-----
DiskSuite stripe maintenance writeup -- NOT TESTED!!!
*** Replace a submirror in a mirror (because of a failed slice):
- mirror d20 has a submirror d22 in the "Needs maintenance" state. The submirror will be recreated from new slices (see the sketch after this entry):
metastat d20
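- a sketch of the remaining steps, with a placeholder slice (c2t0d0s2): detach and clear the bad submirror, recreate it from the new slice, then attach it again:
metadetach -f d20 d22
metaclear d22
metainit d22 1 1 c2t0d0s2
metattach d20 d22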


This article is from the ChinaUnix blog; the original is at http://blog.chinaunix.net/u/10290/showart_48696.html