Posted on 2005-06-23 10:54

RAIDCTL(8)             OpenBSD System Manager's Manual          RAIDCTL(8)
NAME
   raidctl - configuration utility for the RAIDframe disk driver
SYNOPSIS
   raidctl [-v] [-afFgrR component] [-BGipPsSu] [-cC config_file]
         [-A [yes | no | root]] [-I serial_number] dev
DESCRIPTION
   raidctl is the user-land control program for raid(4), the RAIDframe disk
   device.  raidctl is primarily used to dynamically configure and
   unconfigure RAIDframe disk devices.  For more information about the
   RAIDframe disk device, see raid(4).
   This document assumes the reader has at least rudimentary knowledge of
   RAID and RAID concepts.
   The device used by raidctl is specified by dev.  dev may be either the
   full name of the device, e.g. /dev/rraid0c, or just simply raid0 (for
   /dev/rraid0c).
   For several commands (-BGipPsSu), raidctl can accept the word all as the
   dev argument.  If all is used, raidctl will execute the requested action
   for all the configured raid(4) devices.
   The command-line options for raidctl are as follows:
   -a component dev
         Add component as a hot spare for the device dev.
   -A yes dev
         Make the RAID set auto-configurable.  The RAID set will be auto-
         matically configured at boot before the root file system is
         mounted.  Note that all components of the set must be of type
         RAID in the disklabel.
   -A no dev
         Turn off auto-configuration for the RAID set.
   -A root dev
         Make the RAID set auto-configurable, and also mark the set as be-
         ing eligible to contain the root partition.  A RAID set config-
         ured this way will override the use of the boot disk as the root
         device.  All components of the set must be of type RAID in the
         disklabel.  Note that the kernel being booted must currently re-
         side on a non-RAID set and, in order to have the root file system
         correctly mounted from it, the RAID set must have its `a' parti-
         tion (aka raid[0..n]a) set up.
   -B dev  Initiate a copyback of reconstructed data from a spare disk to
         its original disk.  This is performed after a component has
         failed, and the failed drive has been reconstructed onto a spare
         drive.
   -c config_file dev
         Configure the RAIDframe device dev according to the configuration
         given in config_file.  A description of the contents of
         config_file is given later.
   -C config_file dev
         As for -c, but forces the configuration to take place.  This is
         required the first time a RAID set is configured.
   -f component dev
         This marks the specified component as having failed, but does not
         initiate a reconstruction of that component.
   -F component dev
         Fail the specified component of the device, and immediately
         begin a reconstruction of the failed disk onto an available hot
         spare.  This is one of the mechanisms used to start the
         reconstruction process if a component does have a hardware
         failure.
   -g component dev
         Get the component label for the specified component.
   -G dev  Generate the configuration of the RAIDframe device in a format
         suitable for use with raidctl -c or -C.
   -i dev  Initialize the RAID device.  In particular, (re-write) the parity
         on the selected device.  This MUST be done for all RAID sets be-
         fore the RAID device is labeled and before file systems are cre-
         ated on the RAID device.
   -I serial_number dev
         Initialize the component labels on each component of the device.
         serial_number is used as one of the keys in determining whether a
         particular set of components belong to the same RAID set.  While
         not strictly enforced, different serial numbers should be used
         for different RAID sets.  This step MUST be performed when a new
         RAID set is created.
   -p dev  Check the status of the parity on the RAID set.  Displays a sta-
         tus message, and returns successfully if the parity is up-to-
         date.
   -P dev  Check the status of the parity on the RAID set, and initialize
         (re-write) the parity if the parity is not known to be
         up-to-date.  This is normally used after a system crash (and
         before a fsck(8)) to ensure the integrity of the parity.
   -r component dev
         Remove the spare disk specified by component from the set of
         available spare components.
   -R component dev
         Fails the specified component, if necessary, and immediately be-
         gins a reconstruction back to component.  This is useful for re-
         constructing back onto a component after it has been replaced
         following a failure.
   -s dev  Display the status of the RAIDframe device for each of the compo-
         nents and spares.
   -S dev  Check the status of parity re-writing, component reconstruction,
         and component copyback.  The output indicates the amount of
         progress achieved in each of these areas.
   -u dev  Unconfigure the RAIDframe device.
   -v    Be more verbose.  For operations such as reconstructions, parity
         re-writing, and copybacks, provide a progress indicator.
Configuration file
   The format of the configuration file is complex, and only an abbreviated
   treatment is given here.  In the configuration files, a `#' indicates the
   beginning of a comment.
   There are 4 required sections of a configuration file, and 2 optional
   sections.  Each section begins with a `START', followed by the section
   name, and the configuration parameters associated with that section.  The
   first section is the `array' section, and it specifies the number of
   rows, columns, and spare disks in the RAID set.  For example:
         START array
         1 3 0
   indicates an array with 1 row, 3 columns, and 0 spare disks.  Note that
   although multi-dimensional arrays may be specified, they are NOT support-
   ed in the driver.
   The second section, the `disks' section, specifies the actual components
   of the device.  For example:
         START disks
         /dev/sd0e
         /dev/sd1e
         /dev/sd2e
   specifies the three component disks to be used in the RAID device.  If
   any of the specified drives cannot be found when the RAID device is con-
   figured, then they will be marked as `failed', and the system will oper-
   ate in degraded mode.  Note that it is imperative that the order of the
   components in the configuration file does not change between configura-
   tions of a RAID device.  Changing the order of the components will result
   in data loss if the set is configured with the -C option.  In normal cir-
   cumstances, the RAID set will not configure if only -c is specified, and
   the components are out-of-order.
   The next section, which is the `spare' section, is optional, and, if
   present, specifies the devices to be used as `hot spares' -- devices
   which are on-line, but are not actively used by the RAID driver unless
   one of the main components fails.  A simple `spare' section might be:
         START spare
         /dev/sd3e
   for a configuration with a single spare component.  If no spare drives
   are to be used in the configuration, then the `spare' section may be
   omitted.
   The next section is the `layout' section.  This section describes the
   general layout parameters for the RAID device, and provides such informa-
   tion as sectors per stripe unit, stripe units per parity unit, stripe
   units per reconstruction unit, and the parity configuration to use.  This
   section might look like:
         START layout
         # sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
         32 1 1 5
   The sectors per stripe unit specifies, in blocks, the interleave factor;
   i.e. the number of contiguous sectors to be written to each component for
   a single stripe.  Appropriate selection of this value (32 in this exam-
   ple) is the subject of much research in RAID architectures.  The stripe
   units per parity unit and stripe units per reconstruction unit are
   normally each set to 1.  While certain values above 1 are permitted, a
   discussion of valid values and the consequences of using anything other
   than 1 is outside the scope of this document.  The last value in this
   section (5 in this example) indicates the parity configuration desired.
   Valid entries include:
   0    RAID level 0.  No parity, only simple striping.
   1    RAID level 1.  Mirroring.  The parity is the mirror.
   4    RAID level 4.  Striping across components, with parity stored on
         the last component.
   5    RAID level 5.  Striping across components, parity distributed
         across all components.
   There are other valid entries here, including those for Even-Odd parity,
   RAID level 5 with rotated sparing, Chained declustering, and Interleaved
   declustering, but as of this writing the code for those parity operations
   has not been tested with OpenBSD.
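   As a rough illustration of the interleave factor, the following Python
   sketch maps a logical sector of a RAID level 0 set to a column and a
   sector on that column.  It assumes plain round-robin striping and ignores
   any space the driver reserves on each component; locate_sector is a name
   made up for this example and is not part of RAIDframe.

```python
def locate_sector(logical_sector, sect_per_su, num_cols):
    """Map a logical sector of a striped set to (column, sector on column).

    Assumes plain round-robin striping with no parity (RAID level 0) and
    ignores any space reserved at the start of each component.
    """
    stripe_unit = logical_sector // sect_per_su    # stripe unit index overall
    offset_in_su = logical_sector % sect_per_su    # position inside that unit
    column = stripe_unit % num_cols                # units rotate over columns
    stripe = stripe_unit // num_cols               # complete stripes before it
    return column, stripe * sect_per_su + offset_in_su


if __name__ == "__main__":
    # Hypothetical values: 4 columns, 64 sectors per stripe unit.
    for sector in (0, 63, 64, 255, 256):
        print(sector, locate_sector(sector, 64, 4))
```

   With these values, sectors 0 through 63 land on the first column, sector
   64 starts the second column, and sector 256 wraps back to the first
   column at the start of the next stripe.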
   The next required section is the `queue' section.  This is most often
   specified as:
         START queue
         fifo 100
   where the queuing method is specified as FIFO (First-In, First-Out), and
   the size of the per-component queue is limited to 100 requests.  Other
   queuing methods may also be specified, but a discussion of them is beyond
   the scope of this document.
   The final section, the `debug' section, is optional.  For more details on
   this the reader is referred to the RAIDframe documentation discussed in
   the HISTORY section.  See EXAMPLES for a more complete configuration file
   example.
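   The section structure described above can be checked mechanically before
   a file is handed to raidctl.  The Python sketch below is not the parser
   used by raidctl; it only splits the text into its START sections, strips
   `#' comments, and checks that the four required sections are present and
   that the counts in the `array' section match the devices listed.  The
   function names are hypothetical.

```python
REQUIRED_SECTIONS = ("array", "disks", "layout", "queue")


def parse_raid_config(text):
    """Return {section name: list of rows}, each row a list of fields."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()    # '#' begins a comment
        if not line:
            continue
        fields = line.split()
        if fields[0] == "START":
            if len(fields) != 2:
                raise ValueError(f"malformed section header: {raw!r}")
            current = fields[1]
            sections[current] = []
        elif current is None:
            raise ValueError(f"data before the first START: {raw!r}")
        else:
            sections[current].append(fields)
    return sections


def check_raid_config(sections):
    """Return (disks, spares) or raise ValueError on an inconsistency."""
    missing = [name for name in REQUIRED_SECTIONS if name not in sections]
    if missing:
        raise ValueError(f"missing required sections: {missing}")
    num_row, num_col, num_spare = (int(v) for v in sections["array"][0])
    disks = [row[0] for row in sections["disks"]]
    spares = [row[0] for row in sections.get("spare", [])]
    if len(disks) != num_row * num_col:
        raise ValueError("number of disks does not match the array section")
    if len(spares) != num_spare:
        raise ValueError("number of spares does not match the array section")
    return disks, spares
```

   The order of the returned disk list is the order in the file, which is
   the order that must not change between configurations.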
EXAMPLES
   It is highly recommended that, before using the RAID driver for real file
   systems, the system administrator(s) become quite familiar with the
   use of raidctl, and that they understand how the component reconstruction
   process works.  The examples in this section will focus on configuring a
   number of different RAID sets of varying degrees of redundancy.  By work-
   ing through these examples, administrators should be able to develop a
   good feel for how to configure a RAID set, and how to initiate recon-
   struction of failed components.
   In the following examples `raid0' will be used to denote the RAID device.
   `/dev/rraid0c' may be used in place of `raid0'.
Initialization and Configuration
   The initial step in configuring a RAID set is to identify the components
   that will be used in the RAID set.  All components should be the same
   size.  Each component should have a disklabel type of FS_RAID, and a typ-
   ical disklabel entry for a RAID component might look like:
         f:  1800000  200495    RAID             # (Cyl.  405*- 4041*)
   While FS_BSDFFS (e.g. 4.2BSD) will also work as the component type, the
   type FS_RAID (e.g. RAID) is preferred for RAIDframe use, as it is re-
   quired for features such as auto-configuration.  As part of the initial
   configuration of each RAID set, each component will be given a `component
   label'.  A `component label' contains important information about the
   component, including a user-specified serial number, the row and column
   of that component in the RAID set, the redundancy level of the RAID set,
   a 'modification counter', and whether the parity information (if any) on
   that component is known to be correct.  Component labels are an integral
   part of the RAID set, since they are used to ensure that components are
   configured in the correct order, and used to keep track of other vital
   information about the RAID set.  Component labels are also required for
   the auto-detection and auto-configuration of RAID sets at boot time.  For
   a component label to be considered valid, that particular component label
   must be in agreement with the other component labels in the set.  For ex-
   ample, the serial number, `modification counter', number of rows and num-
   ber of columns must all be in agreement.  If any of these are different,
   then the component is not considered to be part of the set.  See raid(4)
   for more information about component labels.
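   The agreement rule can be pictured with a small Python sketch.  It
   compares only the four fields named above and treats the most common
   combination as the reference; that majority rule is an assumption made
   for this example, and the driver's actual decision logic is described in
   raid(4), not here.  LabelKey and odd_ones_out are hypothetical names.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelKey:
    """The label fields that the text says must agree across a set."""
    serial_number: int
    mod_counter: int
    num_rows: int
    num_columns: int


def odd_ones_out(labels):
    """Return the indexes of labels that differ from the most common one."""
    if not labels:
        return []
    majority, _count = Counter(labels).most_common(1)[0]
    return [i for i, label in enumerate(labels) if label != majority]
```

   A component whose modification counter lags behind the others, for
   instance because it was offline during the last shutdown, would show up
   in the returned list.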
   Once the components have been identified, and the disks have appropriate
   labels, raidctl is then used to configure the raid(4) device.  To
   configure the device, a configuration file which looks something like:
         START array
         # numRow numCol numSpare
         1 3 1
         START disks
         /dev/sd1e
         /dev/sd2e
         /dev/sd3e
         START spare
         /dev/sd4e
         START layout
         # sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_5
         32 1 1 5
         START queue
         fifo 100
   is created in a file.  The above configuration file specifies a RAID 5
   set consisting of the components /dev/sd1e, /dev/sd2e, and /dev/sd3e,
   with /dev/sd4e available as a `hot spare' in case one of the three main
   drives should fail.  A RAID 0 set would be specified in a similar way:
         START array
         # numRow numCol numSpare
         1 4 0
         START disks
         /dev/sd10e
         /dev/sd11e
         /dev/sd12e
         /dev/sd13e
         START layout
         # sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_0
         64 1 1 0
         START queue
         fifo 100
   In this case, devices /dev/sd10e, /dev/sd11e, /dev/sd12e, and /dev/sd13e
   are the components that make up this RAID set.  Note that there are no
   hot spares for a RAID 0 set, since there is no way to recover data if any
   of the components fail.
   For a RAID 1 (mirror) set, the following configuration might be used:
         START array
         # numRow numCol numSpare
         1 2 0
         START disks
         /dev/sd20e
         /dev/sd21e
         START layout
         # sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_1
         128 1 1 1
         START queue
         fifo 100
   In this case, /dev/sd20e and /dev/sd21e are the two components of the
   mirror set.  While no hot spares have been specified in this configura-
   tion, they easily could be, just as they were specified in the RAID 5
   case above.  Note as well that RAID 1 sets are currently limited to only
   2 components.  At present, n-way mirroring is not possible.
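   For the three kinds of set shown above, the space available for data
   follows directly from the RAID level and the number of columns.  The
   Python sketch below states that arithmetic; it ignores hot spares, any
   space reserved for component labels, and rounding to whole stripe units,
   so the figure reported by the driver will be somewhat smaller.  The
   function names are made up for this example.

```python
def data_columns(raid_level, num_cols):
    """Return how many columns' worth of space hold data, not redundancy."""
    if raid_level == 0:
        return num_cols                  # striping only, nothing set aside
    if raid_level == 1:
        if num_cols != 2:
            raise ValueError("RAID 1 sets are limited to 2 components")
        return 1                         # the second component is the mirror
    if raid_level in (4, 5):
        if num_cols < 2:
            raise ValueError("need at least one data column plus parity")
        return num_cols - 1              # one column's worth holds parity
    raise ValueError(f"RAID level {raid_level} is not covered by this sketch")


def usable_sectors(raid_level, num_cols, sectors_per_component):
    """Approximate data capacity of the set, in sectors."""
    return data_columns(raid_level, num_cols) * sectors_per_component
```

   For example, three 1800000-sector components at RAID level 5 give roughly
   3600000 sectors for data.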
   The first time a RAID set is configured, the -C option must be used:
         # raidctl -C raid0.conf raid0
   where `raid0.conf' is the name of the RAID configuration file.  The -C
   forces the configuration to succeed, even if any of the component labels
   are incorrect.  The -C option should not be used lightly in situations
   other than initial configurations: if the system is refusing to
   configure a RAID set, there is probably a very good reason for it.  After
   the initial configuration is done (and appropriate component labels are
   added with the -I option) then raid0 can be configured normally with:
         # raidctl -c raid0.conf raid0
   When the RAID set is configured for the first time, it is necessary to
   initialize the component labels, and to initialize the parity on the RAID
   set.  Initializing the component labels is done with:
         # raidctl -I 112341 raid0
   where `112341' is a user-specified serial number for the RAID set.  This
   initialization step is required for all RAID sets.  Also, using different
   serial numbers between RAID sets is strongly encouraged, as using the
   same serial number for all RAID sets will only serve to decrease the use-
   fulness of the component label checking.
   Initializing the RAID set is done via the -i option.  This initialization
   MUST be done for all RAID sets, since among other things it verifies that
   the parity (if any) on the RAID set is correct.  Since this initializa-
   tion may be quite time-consuming, the -v option may be also used in con-
   junction with -i:
         # raidctl -iv raid0
   This will give more verbose output on the status of the initialization:
         Initiating re-write of parity
         Parity Re-write status:
         10% |****                                  | ETA: 06:03 /
   The output provides a `Percent Complete' in both a numeric and graphical
   format, as well as an estimated time to completion of the operation.
   Since it is the parity that provides the `redundancy' part of RAID, it is
   critical that the parity is correct as much as possible.  If the parity
   is not correct, then there is no guarantee that data will not be lost if
   a component fails.
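   Why stale parity is dangerous can be seen from the arithmetic.  For RAID
   levels 4 and 5 the parity of a stripe is the byte-wise exclusive OR of
   its data units, so any one lost unit can be recomputed from the others.
   The Python sketch below shows this on toy four-byte units; it is not
   RAIDframe code, and xor_blocks is a name chosen for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40"]   # two data units
    parity = xor_blocks(data)                            # the parity unit

    # The second component fails: rebuild its unit from the survivor and
    # the parity.
    assert xor_blocks([data[0], parity]) == data[1]

    # If the first unit is rewritten without updating the parity, the same
    # computation no longer yields the lost unit.
    stale = [b"\xff\x02\x03\x04", parity]
    assert xor_blocks(stale) != data[1]
```

   The second check is the case the text warns about: the reconstruction
   still produces bytes, but they are not the bytes that were lost.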
   Once the parity is known to be correct, it is then safe to perform
   disklabel(8), newfs(8), or fsck(8) on the device or its filesystems, and
   then to mount the filesystems for use.
   Under certain circumstances (e.g. the additional component has not ar-
   rived, or data is being migrated off of a disk destined to become a com-
   ponent) it may be desirable to configure a RAID 1 set with only a single
   component.  This can be achieved by configuring the set with a physically
   existing component (as either the first or second component) and with a
   `fake' component.  In the following:
         START array
         # numRow numCol numSpare
         1 2 0
         START disks
         /dev/sd6e
         /dev/sd0e
         START layout
         # sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_1
         128 1 1 1
         START queue
         fifo 100
   /dev/sd0e is the real component, and will be the second disk of a RAID 1
   set.  The component /dev/sd6e, which must exist, but have no physical de-
   vice associated with it, is simply used as a placeholder.  Configuration
   (using -C and -I 12345 as above) proceeds normally, but initialization of
   the RAID set will have to wait until all physical components are present.
   After configuration, this set can be used normally, but will be operating
   in degraded mode.  Once a second physical component is obtained, it can
   be hot-added, the existing data mirrored, and normal operation resumed.
Maintenance of the RAID set
   After the parity has been initialized for the first time, the command:
         # raidctl -p raid0
   can be used to check the current status of the parity.  To check the
   parity and rebuild it if necessary (for example, after an unclean
   shutdown) the command:
         # raidctl -P raid0
   is used.  Note that re-writing the parity can be done while other
   operations on the RAID set are taking place (e.g. while doing an fsck(8)
   on a file system on the RAID set).  However, for maximum effectiveness
   of the RAID set, the parity should be known to be correct before any data
   on the set is modified.
   To see how the RAID set is doing, the following command can be used to
   show the RAID set's status:
         # raidctl -s raid0
   The output will look something like:
         Components:
                  /dev/sd1e: optimal
                  /dev/sd2e: optimal
                  /dev/sd3e: optimal
         Spares:
                  /dev/sd4e: spare
         Parity status: clean
         Reconstruction is 100% complete.
         Parity Re-write is 100% complete.
         Copyback is 100% complete.
   This indicates that all is well with the RAID set.  Of importance here
   are the component lines which read `optimal', and the `Parity status'
   line which indicates that the parity is up-to-date.  Note that if there
   are file systems open on the RAID set, the individual components will not
   be `clean' but the set as a whole can still be clean.
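   For monitoring, the status text can be reduced to a list of components
   that are not `optimal'.  The Python sketch below assumes the output has
   exactly the shape of the sample above (a `Components:' header, an
   optional `Spares:' header, and one `name: state' line per device); the
   real output may differ between releases, so treat the pattern as an
   assumption.  parse_status and not_optimal are hypothetical names.

```python
import re

# One device line, e.g. "/dev/sd1e: optimal" or "component1: failed".
ENTRY = re.compile(r"^(\S+): (\w+)$")


def parse_status(text):
    """Return (components, spares) as {device name: state} dictionaries."""
    components, spares = {}, {}
    target = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Components:":
            target = components
        elif line == "Spares:":
            target = spares
        else:
            match = ENTRY.match(line)
            if target is not None and match:
                target[match.group(1)] = match.group(2)
            else:
                target = None            # any other line ends the section
    return components, spares


def not_optimal(components):
    """Return the sorted names of components whose state is not optimal."""
    return sorted(dev for dev, state in components.items()
                  if state != "optimal")
```

   The text would come from running raidctl -s on the device; how it is
   captured is left to the caller.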
   The -v option may be also used in conjunction with -s:
         # raidctl -sv raid0
   In this case, the components' label information (see the -g option) will
   be given as well:
         Components:
                  /dev/sd1e: optimal
                  /dev/sd2e: optimal
                  /dev/sd3e: optimal
         Spares:
                  /dev/sd4e: spare
         Component label for /dev/sd1e:
            Row: 0 Column: 0 Num Rows: 1 Num Columns: 3
            Version: 2 Serial Number: 13432 Mod Counter: 65
            Clean: No Status: 0
            sectPerSU: 32 SUsPerPU: 1 SUsPerRU: 1
            RAID Level: 5  blocksize: 512 numBlocks: 1799936
            Autoconfig: No
            Last configured as: raid0
         Component label for /dev/sd2e:
            Row: 0 Column: 1 Num Rows: 1 Num Columns: 3
            Version: 2 Serial Number: 13432 Mod Counter: 65
            Clean: No Status: 0
            sectPerSU: 32 SUsPerPU: 1 SUsPerRU: 1
            RAID Level: 5  blocksize: 512 numBlocks: 1799936
            Autoconfig: No
            Last configured as: raid0
         Component label for /dev/sd3e:
            Row: 0 Column: 2 Num Rows: 1 Num Columns: 3
            Version: 2 Serial Number: 13432 Mod Counter: 65
            Clean: No Status: 0
            sectPerSU: 32 SUsPerPU: 1 SUsPerRU: 1
            RAID Level: 5  blocksize: 512 numBlocks: 1799936
            Autoconfig: No
            Last configured as: raid0
         Parity status: clean
         Reconstruction is 100% complete.
         Parity Re-write is 100% complete.
         Copyback is 100% complete.
   To check the component label of /dev/sd1e, the following is used:
         # raidctl -g /dev/sd1e raid0
   The output of this command will look something like:
         Component label for /dev/sd1e:
            Row: 0 Column: 0 Num Rows: 1 Num Columns: 3
            Version: 2 Serial Number: 13432 Mod Counter: 65
            Clean: No Status: 0
            sectPerSU: 32 SUsPerPU: 1 SUsPerRU: 1
            RAID Level: 5  blocksize: 512 numBlocks: 1799936
            Autoconfig: No
            Last configured as: raid0
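   The label text can be turned into a dictionary for comparison between
   components.  The Python sketch below relies only on the layout of the
   sample output above, where each line holds one or more `Name: value'
   pairs; that layout is an assumption and the function name is made up for
   this example.  Values are kept as strings.

```python
import re

# A field name is letters and spaces, followed by a colon and one value.
FIELD = re.compile(r"([A-Za-z][A-Za-z ]*?):\s+(\S+)")
HEADER = "Component label for"


def parse_component_label(text):
    """Return the fields of one component label as a {name: value} dict."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(HEADER):
            # "Component label for /dev/sd1e:" names the component itself.
            fields["component"] = line[len(HEADER):].strip().rstrip(":")
            continue
        for name, value in FIELD.findall(line):
            fields[name] = value
    return fields
```

   Comparing the `Serial Number' and `Mod Counter' entries of two parsed
   labels is a quick way to see whether two components claim to belong to
   the same set.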
Dealing with Component Failures
   If for some reason (perhaps to test reconstruction) it is necessary to
   pretend a drive has failed, the following will perform that function:
         # raidctl -f /dev/sd2e raid0
   The system will then be performing all operations in degraded mode, where
   missing data is re-computed from existing data and the parity.  In this
   case, obtaining the status of raid0 will return (in part):
         Components:
                  /dev/sd1e: optimal
                  /dev/sd2e: failed
                  /dev/sd3e: optimal
         Spares:
                  /dev/sd4e: spare
   Note that with the use of -f a reconstruction has not been started.  To
   both fail the disk and start a reconstruction, the -F option must be
   used:
         # raidctl -F /dev/sd2e raid0
   The -f option may be used first, and then the -F option used later, on
   the same disk, if desired.  Immediately after the reconstruction is
   started, the status will report:
         Components:
                  /dev/sd1e: optimal
                  /dev/sd2e: reconstructing
                  /dev/sd3e: optimal
         Spares:
                  /dev/sd4e: used_spare
         [...]
         Parity status: clean
         Reconstruction is 10% complete.
         Parity Re-write is 100% complete.
         Copyback is 100% complete.
   This indicates that a reconstruction is in progress.  To find out how the
   reconstruction is progressing the -S option may be used.  This will indi-
   cate the progress in terms of the percentage of the reconstruction that
   is completed.  When the reconstruction is finished the -s option will
   show:
         Components:
                  /dev/sd1e: optimal
                  /dev/sd2e: spared
                  /dev/sd3e: optimal
         Spares:
                  /dev/sd4e: used_spare
         [...]
         Parity status: clean
         Reconstruction is 100% complete.
         Parity Re-write is 100% complete.
         Copyback is 100% complete.
   At this point there are at least two options.  First, if /dev/sd2e is
   known to be good (i.e. the failure was either caused by -f or -F, or the
   failed disk was replaced), then a copyback of the data can be initiated
   with the -B option.  In this example, this would copy the entire contents
   of /dev/sd4e to /dev/sd2e.  Once the copyback procedure is complete, the
   status of the device would be (in part):
         Components:
                  /dev/sd1e: optimal
                  /dev/sd2e: optimal
                  /dev/sd3e: optimal
         Spares:
                  /dev/sd4e: spare
   and the system is back to normal operation.
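   The sequence of states walked through above can be summarized as a small
   table.  The Python sketch below models only the states and options shown
   in this example (one component, one hot spare); the event names
   recon_done and copyback_done are invented here to stand for the
   reconstruction and the copyback started by -B finishing, and states that
   may be reported while a copyback is running are not modelled.

```python
# Each key is ((component state, spare state), event); each value is the
# resulting (component state, spare state) pair.
TRANSITIONS = {
    (("optimal", "spare"), "-f"): ("failed", "spare"),
    (("optimal", "spare"), "-F"): ("reconstructing", "used_spare"),
    (("failed", "spare"), "-F"): ("reconstructing", "used_spare"),
    (("reconstructing", "used_spare"), "recon_done"): ("spared", "used_spare"),
    (("spared", "used_spare"), "copyback_done"): ("optimal", "spare"),
}


def step(state, event):
    """Apply one event, or raise ValueError if it is not in the table."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not modelled in state {state!r}") from None


def run(events, state=("optimal", "spare")):
    """Apply a sequence of events, starting from a healthy set."""
    for event in events:
        state = step(state, event)
    return state
```

   Running the events -f, -F, recon_done, and copyback_done in order
   returns the pair to its starting state, which matches the final status
   listing above.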
   The second option after the reconstruction is to simply use /dev/sd4e in
   place of /dev/sd2e in the configuration file.  For example, the configu-
   ration file (in part) might now look like:
         START array
         1 3 0
         START disks
         /dev/sd1e
         /dev/sd4e
         /dev/sd3e
   This can be done as /dev/sd4e is completely interchangeable with
   /dev/sd2e at this point.  Note that extreme care must be taken when
   changing the order of the drives in a configuration.  This is one of the
   few instances where the devices and/or their orderings can be changed
   without loss of data!  In general, the ordering of components in a con-
   figuration file should never be changed.
   If a component fails and there are no hot spares available on-line, the
   status of the RAID set might (in part) look like:
         Components:
                  /dev/sd1e: optimal
                  /dev/sd2e: failed
                  /dev/sd3e: optimal
         No spares.
   In this case there are a number of options.  The first option is to add a
   hot spare using:
         # raidctl -a /dev/sd4e raid0
   After the hot add, the status would then be:
         Components:
                  /dev/sd1e: optimal
                  /dev/sd2e: failed
                  /dev/sd3e: optimal
         Spares:
                  /dev/sd4e: spare
   Reconstruction could then take place using -F as described above.
   A second option is to rebuild directly onto /dev/sd2e.  Once the disk
   containing /dev/sd2e has been replaced, one can simply use:
         # raidctl -R /dev/sd2e raid0
   to rebuild the /dev/sd2e component.  As the rebuilding is in progress,
   the status will be:
         Components:
                  /dev/sd1e: optimal
                  /dev/sd2e: reconstructing
                  /dev/sd3e: optimal
         No spares.
   and when completed, will be:
         Components:
                  /dev/sd1e: optimal
                  /dev/sd2e: optimal
                  /dev/sd3e: optimal
         No spares.
   In circumstances where a particular component is completely unavailable
   after a reboot, a special component name will be used to indicate the
   missing component.  For example:
         Components:
                  /dev/sd2e: optimal
                  component1: failed
         No spares.
   indicates that the second component of this RAID set was not detected at
   all by the auto-configuration code.  The name `component1' can be used
   anywhere a normal component name would be used.  For example, to add a
   hot spare to the above set, and rebuild to that hot spare, the following
   could be done:
         # raidctl -a /dev/sd3e raid0
         # raidctl -F component1 raid0
   at which point the data missing from `component1' would be reconstructed
   onto /dev/sd3e.
RAID on RAID
   RAID sets can be layered to create more complex and much larger RAID
   sets.  A RAID 0 set, for example, could be constructed from four RAID 5
   sets.  The following configuration file shows such a setup:
         START array
         # numRow numCol numSpare
         1 4 0
         START disks
         /dev/raid1e
         /dev/raid2e
         /dev/raid3e
         /dev/raid4e
         START layout
         # sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_0
         128 1 1 0
         START queue
         fifo 100
   A similar configuration file might be used for a RAID 0 set constructed
   from components on RAID 1 sets.  In such a configuration, the mirroring
   provides a high degree of redundancy, while the striping provides addi-
   tional speed benefits.
Auto-configuration and Root on RAID
   RAID sets can also be auto-configured at boot.  To make a set auto-con-
   figurable, simply prepare the RAID set as above, and then do a:
         # raidctl -A yes raid0
   to turn on auto-configuration for that set.  To turn off auto-configura-
   tion, use:
         # raidctl -A no raid0
   RAID sets which are auto-configurable will be configured before the root
   file system is mounted.  These RAID sets are thus available for use as a
   root file system, or for any other file system.  A primary advantage of
   using the auto-configuration is that RAID components become more
   independent of the disks they reside on.  For example, SCSI IDs can
   change, but auto-configured sets will always be configured correctly,
   even if the SCSI IDs of the component disks have become scrambled.
   Having a system's root file system (/) on a RAID set is also allowed,
   with the `a' partition of such a RAID set being used for /.  To use
   raid0a as the root file system, simply use:
         # raidctl -A root raid0
   To return raid0 to be just an auto-configuring set simply use the -A yes
   arguments.
   Note that kernels can't be directly read from a RAID component.  To sup-
   port the root file system on RAID sets, some mechanism must be used to
   get a kernel booting.  For example, a small partition containing only the
   secondary boot-blocks and an alternate kernel (or two) could be used.
   Once a kernel is booting however, and an auto-configured RAID set is
   found that is eligible to be root, then that RAID set will be auto-con-
   figured and its `a' partition (aka raid[0..n]a) will be used as the root
   file system.  If two or more RAID sets claim to be root devices, then the
   user will be prompted to select the root device.  At this time, RAID 0,
   1, 4, and 5 sets are all supported as root devices.
   A typical RAID 1 setup with root on RAID might be as follows:
   1. wd0a - a small partition, which contains a complete, bootable, basic
      OpenBSD installation.
   2. wd1a - also contains a complete, bootable, basic OpenBSD installa-
      tion.
   3. wd0e and wd1e - a RAID 1 set, raid0, used for the root file system.
   4. wd0f and wd1f - a RAID 1 set, raid1, which will be used only for
      swap space.
   5. wd0g and wd1g - a RAID 1 set, raid2, used for /usr, /home, or other
      data, if desired.
   6. wd0h and wd1h - a RAID 1 set, raid3, if desired.
   RAID sets raid0, raid1, and raid2 are all marked as auto-configurable.
   raid0 is marked as being a root-able raid.  When new kernels are in-
   stalled, the kernel is not only copied to /, but also to wd0a and wd1a.
   The kernel on wd0a is required, since that is the kernel the system boots
   from.  The kernel on wd1a is also required, since that will be the kernel
   used should wd0 fail.  The important point here is to have redundant
   copies of the kernel available, in the event that one of the drives fails.
   There is no requirement that the root file system be on the same disk as
   the kernel.  For example, obtaining the kernel from wd0a, and using sd0e
   and sd1e for raid0, and the root file system, is fine.  It is critical,
   however, that there be multiple kernels available, in the event of media
   failure.
   Multi-layered RAID devices (such as a RAID 0 set made up of RAID 1 sets)
   are not supported as root devices or auto-configurable devices at this
   point.  (Multi-layered RAID devices are supported in general, however, as
   mentioned earlier.)  Note that in order to enable component auto-detec-
   tion and auto-configuration of RAID devices, the line:
         option RAID_AUTOCONFIG
   must be in the kernel configuration file.  See raid(4) for more details.
Unconfiguration
   The final operation performed by raidctl is to unconfigure a raid(4)
   device.  This is accomplished via a simple:
         # raidctl -u raid0
   at which point the device is ready to be reconfigured.
Performance Tuning
   Selection of the various parameter values which result in the best per-
   formance can be quite tricky, and often requires a bit of trial-and-error
   to get those values most appropriate for a given system.  A whole range
   of factors come into play, including:
   1. Types of components (e.g. SCSI vs. IDE) and their bandwidth
   2. Types of controller cards and their bandwidth
   3. Distribution of components among controllers
   4. IO bandwidth
   5. File system access patterns
   6. CPU speed
   As with most performance tuning, benchmarking under real-life loads may
   be the only way to measure expected performance.  Understanding some of
   the underlying technology is also useful in tuning.  The goal of this
   section is to provide pointers to those parameters which may make signif-
   icant differences in performance.
   For a RAID 1 set, a SectPerSU value of 64 or 128 is typically sufficient.
   Since data in a RAID 1 set is arranged in a linear fashion on each compo-
   nent, selecting an appropriate stripe size is somewhat less critical than
   it is for a RAID 5 set.  However, a stripe size that is too small will
   cause large IOs to be broken up into a number of smaller ones, hurting
   performance.  At the same time, a large stripe size may cause problems
   with concurrent accesses to stripes, which may also affect performance.
   Thus values in the range of 32 to 128 are often the most effective.
   Tuning RAID 5 sets is trickier.  In the best case, IO is presented to the
   RAID set one stripe at a time.  Since the entire stripe is available at
   the beginning of the IO, the parity of that stripe can be calculated be-
   fore the stripe is written, and then the stripe data and parity can be
   written in parallel.  When the amount of data being written is less than
   a full stripe worth, the `small write' problem occurs.  Since a `small
   write' means only a portion of the stripe on the components is going to
   change, the data (and parity) on the components must be updated slightly
   differently.  First, the `old parity' and `old data' must be read from
   the components.  Then the new parity is constructed, using the new data
   to be written, and the old data and old parity.  Finally, the new data
   and new parity are written.  All this extra data shuffling results in a
   serious loss of performance, and is typically 2 to 4 times slower than a
   full stripe write (or read).  To combat this problem in the real world,
   it may be useful to ensure that stripe sizes are small enough that a
   `large IO' from the system will use exactly one large stripe write.  As
   is seen later, there are some file system dependencies which may come in-
   to play here as well.
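
   The read-modify-write sequence described above can be sketched with
   shell arithmetic.  This is only an illustration, assuming the usual XOR
   parity used by RAID 5; the function names and the one-byte values are
   hypothetical and are not part of raidctl or RAIDframe.

```sh
#!/bin/sh
# Hypothetical illustration; not part of raidctl or RAIDframe.
# Assumption: parity is the XOR of the data stripe units in a stripe.

# new_parity OLD_PARITY OLD_DATA NEW_DATA
# Small write: XOR out the old data's contribution, XOR in the new data.
# This needs the old parity and old data, hence the two extra reads.
new_parity() {
    echo $(( $1 ^ $2 ^ $3 ))
}

# full_parity D0 D1 D2 D3
# Full-stripe write: parity comes from the new data alone, with no reads.
full_parity() {
    echo $(( $1 ^ $2 ^ $3 ^ $4 ))
}

# Four one-byte values standing in for the data stripe units.
d0=165 d1=60 d2=15 d3=240
p=$(full_parity $d0 $d1 $d2 $d3)

# Overwrite d1 only, updating the parity the small-write way.
new_d1=119
p_small=$(new_parity $p $d1 $new_d1)

# Recomputing parity over the whole stripe gives the same value.
p_full=$(full_parity $d0 $new_d1 $d2 $d3)
[ "$p_small" -eq "$p_full" ] && echo "parity consistent"
```

   The small write costs two reads and two writes where a full stripe
   write needs no reads at all, which is where the slowdown comes from.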
   Since the size of a `large IO' is often (currently) only 32K or 64K, on a
   5-drive RAID 5 set it may be desirable to select a SectPerSU value of 16
   blocks (8K) or 32 blocks (16K).  Since there are 4 data stripe units per
   stripe (the fifth holds parity), the maximum data per stripe is 64
   blocks (32K) or 128 blocks (64K).  Again, empirical measurement will
   provide the best indicators of which values will yield better
   performance.
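
   The arithmetic above can be checked with a small shell function.  This
   is a sketch only: stripe_kb is a hypothetical helper, and it assumes
   512-byte sectors and one component's worth of parity per stripe.

```sh
#!/bin/sh
# Hypothetical helper; not part of raidctl.
# stripe_kb SECT_PER_SU N_COMPONENTS
# Prints the data per stripe, in KB, for a RAID 5 set.
# Assumptions: 512-byte sectors, one stripe unit per stripe holds parity.
stripe_kb() {
    echo $(( $1 * ($2 - 1) * 512 / 1024 ))
}

stripe_kb 16 5    # 16 sectors * 4 data components = 64 sectors
stripe_kb 32 5    # 32 sectors * 4 data components = 128 sectors
```

   For the 5-drive set discussed here this prints 32 and 64, matching the
   32K and 64K `large IO' sizes.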
   The parameters used for the file system are also critical to good
   performance.  For newfs(8), for example, increasing the block size to 32K or
   64K may improve performance dramatically.  Also, changing the cylinders-
   per-group parameter from 16 to 32 or higher is often not only necessary
   for larger file systems, but may also have positive performance implica-
   tions.
Summary
   Despite the length of this man-page, configuring a RAID set is a
   relatively straight-forward process.  All that needs to be done is to
   follow these steps:
   1. Use disklabel(8) to create the components (of type RAID).
   2. Construct a RAID configuration file: e.g. `raid0.conf'
   3. Configure the RAID set with:
            # raidctl -C raid0.conf raid0
   4. Initialize the component labels with:
            # raidctl -I 123456 raid0
   5. Initialize other important parts of the set with:
            # raidctl -i raid0
   6. Get the default label for the RAID set:
            # disklabel raid0 > /tmp/label
   7. Edit the label:
            # vi /tmp/label
   8. Put the new label on the RAID set:
            # disklabel -R -r raid0 /tmp/label
   9. Create the file system:
            # newfs /dev/rraid0e
   10.  Mount the file system:
            # mount /dev/raid0e /mnt
   11.  Use:
            # raidctl -c raid0.conf raid0
      to re-configure the RAID set the next time it is needed, or put
      raid0.conf into /etc where it will automatically be started by the
      /etc/rc scripts.
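
   Steps 3 to 10 can be collected into a dry-run helper that prints the
   commands for review instead of running them.  This is a sketch only:
   raid_setup_plan is a hypothetical function, not part of raidctl, and
   the commands it prints are the ones listed in the steps above.

```sh
#!/bin/sh
# Hypothetical helper; not part of raidctl.
# raid_setup_plan SET CONF SERIAL PARTITION MOUNTPOINT
# Prints (does not execute) the commands from steps 3 to 10.
raid_setup_plan() {
    set_name=$1 conf=$2 serial=$3 part=$4 mnt=$5
    cat <<EOF
raidctl -C ${conf} ${set_name}
raidctl -I ${serial} ${set_name}
raidctl -i ${set_name}
disklabel ${set_name} > /tmp/label
vi /tmp/label
disklabel -R -r ${set_name} /tmp/label
newfs /dev/r${set_name}${part}
mount /dev/${set_name}${part} ${mnt}
EOF
}

# Print the plan for the example used in this section.
raid_setup_plan raid0 raid0.conf 123456 e /mnt
```

   Printing the plan first gives a chance to check the device names before
   anything destructive, such as newfs(8), is run by hand.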
WARNINGS
   Certain RAID levels (1, 4, 5, 6, and others) can protect against some
   data loss due to component failure.  However, the loss of two components of
   a RAID 4 or 5 system, or the loss of a single component of a RAID 0 sys-
   tem will result in the entire filesystem being lost.  RAID is NOT a sub-
   stitute for good backup practices.
   Recomputation of parity MUST be performed whenever there is a chance that
   it may have been compromised.  This includes after system crashes, or be-
   fore a RAID device has been used for the first time.  Failure to keep
   parity correct will be catastrophic should a component ever fail -- it is
   better to use RAID 0 and get the additional space and speed, than it is
   to use parity, but not keep the parity correct.  At least with RAID 0
   there is no perception of increased data security.
FILES
   /dev/{,r}raid*  raid device special files.
SEE ALSO

   ccd(4), raid(4), rc(8)
HISTORY
   RAIDframe is a framework for rapid prototyping of RAID structures devel-
   oped by the folks at the Parallel Data Laboratory at Carnegie Mellon Uni-
   versity (CMU).  A more complete description of the internals and func-
   tionality of RAIDframe is found in the paper "RAIDframe: A Rapid Proto-
   typing Tool for RAID Systems", by William V. Courtright II, Garth Gibson,
   Mark Holland, LeAnn Neal Reilly, and Jim Zelenka, and published by the
   Parallel Data Laboratory of Carnegie Mellon University.
   The raidctl command first appeared as a program in CMU's RAIDframe v1.1
   distribution.  This version of raidctl is a complete re-write, and first
   appeared in NetBSD 1.4 from where it was ported to OpenBSD 2.5.
BUGS
   Hot-spare removal is currently not available.
COPYRIGHT
   The RAIDframe Copyright is as follows:
   Copyright (c) 1994-1996 Carnegie-Mellon University.
   All rights reserved.
   Permission to use, copy, modify and distribute this software and
   its documentation is hereby granted, provided that both the copyright
   notice and this permission notice appear in all copies of the
   software, derivative works or modified versions, and any portions
   thereof, and that both notices appear in supporting documentation.
   CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS"
   CONDITION.  CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND
   FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE.
   Carnegie Mellon requests users of this software to return to
   Software Distribution Coordinator  or  Software.Distribution@CS.CMU.EDU
   School of Computer Science
   Carnegie Mellon University
   Pittsburgh PA 15213-3890
   any improvements or extensions that they make and grant Carnegie the
   rights to redistribute these changes.
OpenBSD 3.7                   July 10, 2001                            15

This article comes from the ChinaUnix blog; to view the original, see: http://blog.chinaunix.net/u/2389/showart_32469.html
