[Backup Software] STK ACSLS and Veritas NetBackup media server

Posted on 2006-12-03 13:09

TABLE OF CONTENTS
1.  LOGGING
    1.1  LOG LOCATIONS
          1.1.1 Event Logs
    1.2  ACS PROCESS TRACING
          1.2.1  ACSSSI Tracing on the  VERITAS NetBackup ™ media server
          1.2.2  ACSSS Tracing on the Library Server
2  ACS LIBRARY SERVER (ACSLS) FUNCTIONS AND COMMANDS
    2.1  ACCESS CONTROL ON THE ACS LIBRARY SERVER (ACSLS)
    2.2  ACSSA COMMANDS ON THE ACS LIBRARY SERVER
          2.2.1  Log on to the ACSLS Server
          2.2.2  Query the Library Management Unit
          2.2.3  Query the Cartridge Access Ports
          2.2.4  Query silos (Library Storage Modules)
          2.2.5  Query Drives
          2.2.6  Query Volumes
          2.2.7  Command to Start Request Processing
          2.2.8  Vary on LSM
          2.2.9  Logoff from ACSSA (ACSLS Server interface)
   2.3  ACSLS TAPE CLEANING
3  DEVICE CONFIGURATION FOR ACSLS CONTROLLED TAPE DRIVES
   3.1  DEVICE CONFIGURATION
          3.1.1  SSO Device Configuration for ACSLS
          3.1.2  NON-SSO Device Configuration for ACSLS
          3.1.3  Initial configuration of StorageTek T9940A and T9940B tape drives in an ACSLS environment
4  ROBTEST FOR ACS LIBRARIES
   4.1  INVOKING ROBTEST
   4.2  ROBTEST SYNTAX
          4.2.1  Command to Obtain Drive Status
          4.2.2  Command to Query Volumes
          4.2.3  Command to Mount a Volume
          4.2.4  Command to Dismount a Volume
   4.3  HOW TO DEFINE ACSLS SCRATCH POOLS AND ADD VOLUMES USING ROBTEST  
5  MEDIA
   5.1  AVAILABLE MEDIA SCRIPT
   5.2  HOW TO SPECIFY WHICH MEDIA ACCESS PORT (MAP) TO USE FOR TAPE EJECTION
6  COMMUNICATION
   6.1  REMOTE PROCEDURE CALL
          6.1.1  How to Start RPC on Different Operating Systems
          6.1.2  How to Verify that RPC is Running
          6.1.3  How to Verify the ACSSS Program Registration
          6.1.4  Basic snoop Output
7  COMMON ACS ERROR MESSAGES
   7.1  ACS (2) UNAVAILABLE:  INITIALIZATION FAILED:  UNABLE TO INITIALIZE ROBOT
   7.2  ACS STATUS = 54, STATUS_IPC_FAILURE
   7.3  ACS STATUS = 72, STATUS_PENDING
   7.4  STATUS_NI_FAILURE
          7.4.1  ACS status = 104, STATUS_NI_FAILURE
          7.4.2  ACS status = 105, STATUS_NI_TIMEDOUT
1.  LOGGING
This section covers ACSLS log locations and ACS process tracing.

1.1  LOG LOCATIONS
1.1.1  Event Logs
Event log location on NetBackup media server: /usr/openv/volmgr/debug/acsssi/event.log
Event log location on ACS Library Server: /export/home/ACSSS/log/acsss_event.log
(There is also an install, configuration change, and statistics log in the same directory.)
Typical event log entries are cap operations, remote procedure call (RPC) and client initialization, robot errors, NI failures, and drive status changes.

1.2  ACS PROCESS TRACING
1.2.1  ACSSSI Tracing on the  VERITAS NetBackup media server
1.2.1.1  To turn on acsssi tracing on the NetBackup media server, send a SIGUSR1 signal to the acsssi process as follows:
a. Ensure this directory exists:
/usr/openv/volmgr/debug/acsssi
b. Find the acsssi PID by running the command:
/usr/openv/volmgr/bin/vmps
c. Toggle on acsssi tracing:
kill -USR1 [acsssi pid]
(This will start a trace.log in the /usr/openv/volmgr/debug/acsssi directory.)
d. To turn off acsssi tracing:
kill -USR1 [acsssi pid]
Tracing can be turned on or off multiple times using the same kill command.
NOTE: To read the trace.log, it is necessary to use the StorageTek trace_decode. Please contact StorageTek for a copy and instructions for use.
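Steps a through d above can be sketched as a small shell helper. This is only a sketch under stated assumptions: it assumes vmps prints ps-style lines in which the PID is the second whitespace-separated field, and the function names acsssi_pids and toggle_acsssi_trace are hypothetical, not part of NetBackup.

```shell
#!/bin/sh
# Print the PID of every acsssi process found in ps-style text on stdin.
# ASSUMPTION: the PID is the second whitespace-separated field, as in
# "ps -ef" output; check this against real vmps output before relying on it.
acsssi_pids() {
    awk '/acsssi/ && $2 ~ /^[0-9]+$/ { print $2 }'
}

# Toggle acsssi tracing by sending SIGUSR1 to each acsssi process.
# (hypothetical helper; run as root on the media server)
toggle_acsssi_trace() {
    # Step a: make sure the debug directory exists.
    mkdir -p /usr/openv/volmgr/debug/acsssi || return 1
    # Steps b and c/d: find each PID and send the signal.
    /usr/openv/volmgr/bin/vmps | acsssi_pids | while read -r pid; do
        kill -USR1 "$pid"
    done
}
```

Calling toggle_acsssi_trace once would turn tracing on, and calling it again would turn it off, since the same signal toggles the state.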
1.2.2 ACSSS Tracing on the Library Server
1.2.2.1 To turn on the Library Server ability to trace the ONC RPC session and capture packets exchanged between the media server (SSI) and Library Server system (CSI):
a. Run toggle with the option "on" located in /export/home/ACSSS/diag/bin:
#./toggle on
b. To turn on CSI tracing, simply send a SIGUSR1 signal to the CSI process. This can be accomplished by using the kill command as follows:
# kill -USR1 [CSI pid]
NOTE: It is necessary to first get the PID of the CSI using the ps command.
Once enabled, an acsssi_trace.log will be created in the /export/home/ACSSS/log directory. The log file contains a record of all packet activity between the media server and the ACSLS server. Each packet is displayed with a time stamp, the direction of the packet, the SSI client IP address, port, identifier, and a hex dump of the contents of the packet.
To read the trace log, it is necessary to use the StorageTek trace_decode utility. Please contact StorageTek for a copy and instructions for use. The decoder will output each packet with the time stamp, the packet direction (to or from the ACSLS), the command type (e.g. start), the packet type (e.g. request, response), the number of bytes in the packet, and the values of each of the fields in the CSI header and message header structures, plus any command-specific parameters. For each field in the structure, the byte offset, size, and value (in hex and ASCII) are also given.
To turn off tracing, use the same method used to turn on tracing. Tracing can be toggled on/off multiple times using the same command.
For Windows:
Windows event logging is turned on using the mini_el and the ACSSEL function shipped with the ACS product. The packet trace is controlled using the toggle_trace script. Both tools are in the Program Files\StorageTek\LibAttach\bin directory.
NOTE: Do not leave tracing on indefinitely because it may fill disk space over time.
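One way to act on the note above is to check the size of the trace log from time to time. The following is a minimal sketch; the function name trace_log_too_big and the size limit are hypothetical, and the example path is the media server trace.log location mentioned earlier.

```shell
#!/bin/sh
# Return 0 (true) when FILE exists and is larger than LIMIT bytes,
# and non-zero otherwise (including when FILE does not exist).
trace_log_too_big() {
    file=$1
    limit=$2
    [ -f "$file" ] || return 1
    # Arithmetic expansion strips any padding that wc may print.
    size=$(($(wc -c < "$file")))
    [ "$size" -gt "$limit" ]
}
```

For example, `trace_log_too_big /usr/openv/volmgr/debug/acsssi/trace.log 104857600 && echo "trace.log exceeds 100 MB; consider turning tracing off"` could be run from cron while tracing is enabled.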
2.  ACS LIBRARY SERVER (ACSLS) FUNCTIONS AND COMMANDS
This section covers ACSLS functions and commands.

2.1   ACCESS CONTROL ON THE ACS LIBRARY SERVER (ACSLS)
Under ACSLS there exists the ability to control command and volume access. ACSLS uses a set of client identification files and a series of allow or disallow files to manage access control. These control files reside on the ACSLS library server in the $ACS_HOME/data/external/access_control directory. The internet.addresses file allows access control of hosts. ACSLS will compare this file against the user_id field in the received RPC request packet to determine whether to forward the packet on for further processing. A non-zero return code for this operation will result in a STATUS_INVALID_OPTION response being sent back to the media server.
Volume control can be done through the set owner command in the cmd_proc utility or in the ownership.assignment file. A STATUS_INVALID_OPTION status is returned for commands that are rejected due to access control violations.
Volume access control applies to the following commands:
dismount, lock, mount_readonly, set_clean, set_scratch, eject, mount, query_volume, set_owner, unlock
For further information on access control, please contact StorageTek.
2.2  ACSSA COMMANDS ON THE ACS LIBRARY SERVER
2.2.1  Log on to the ACSLS Server
# su - acsss
At the prompt, enter:
$ cmd_proc -ql
Wait for the ACSSA> prompt.
2.2.2  Query the Library Management Unit
ACSSA> q lmu all
2004-01-28 14:24:10                LMU Status
ACS:   0      Mode: SCSI LMU           Master Status:  Communicating
                                     Standby Status:  -
Port    Port State  Role        CL  Port Name
0, 0   online       -           -  /dev/mchanger2
2.2.3  Query the Cartridge Access Ports
ACSSA> q cap all
2004-01-28 14:25:30                CAP Status
Identifier   Priority  Size  State            Mode       Status
0, 0,0     0         10    online           automatic  available
2.2.4  Query silos (Library Storage Modules)
ACSSA> q lsm all
2004-01-28 14:26:22                LSM Status
Identifier   State           Free Cell  Audit  Mount  Dismount  Enter  Eject
                            Count      C/P    C/P    C/P       C/P    C/P
0, 0       online           36         0/0    0/0    0/0       0/0    0/0
2.2.5 Query Drives
ACSSA> q drive all
2004-01-28 14:27:34               Drive Status
Identifier   State           Status      Volume     Type
0, 0, 0, 0 online          available              DLT7000
0, 0, 0, 1 online          available              DLT7000
0, 0, 0, 2 online          available              9840
0, 0, 0, 3 online          available              9840
2.2.6  Query Volumes
ACSSA> q volume all
2004-01-28 15:58:36              Volume Status
Identifier  Status          Current Location        Type
000002      home              0, 0, 1, 0, 0         STK1R
000003      home              0, 0, 0, 2, 0         STK1R
000004      home              0, 0, 0, 3, 0         STK1R
000005      home              0, 0, 1, 5, 0         STK1R
000006      home              0, 0, 1, 8, 1         STK1R
000008      home              0, 0, 0,23, 0         STK1R
000009      home              0, 0, 1,12, 1         STK1R
2004-01-28 15:58:37              Volume Status
Identifier  Status          Current Location        Type
000047      home              0, 0, 1, 9, 0         STK1R
000048      home              0, 0, 1,10, 1         STK1R
000049      home              0, 0, 0, 7, 0         STK1R
000050      home              0, 0, 0, 4, 0         STK1R
FX0023      home              0, 0, 0, 0, 0         SDLT
2.2.7  Command to Start Request Processing
ACSSA> start
Start: ACSLM Request Processing Started: Success.
2.2.8 Vary on LSM
ACSSA> vary lsm
LSM identifier (acs,lsm): 0,0
LSM identifier (acs/lsm):
State(diagnostic/offline/online): online
2004-03-26 11:20:53 107 LSM 0,0: online
ACSSA> LSM 0,0 varied online
2.2.9 Logoff from ACSSA (ACSLS Server interface)
ACSSA> logoff
2.3 ACSLS TAPE CLEANING
ACS robot types are self cleaning. Tape cleaning should not be initiated by NetBackup. If a TapeAlert-based cleaning flag is set by LTID or avrd for an ACS, TLH, or an LMF drive, the vmd/DA will not release the drives.
To disable TapeAlert checking and eliminate "TapeAlert is not supported" messages in the syslog, add the NO_TAPEALERT touch file.
For UNIX:
/usr/openv/volmgr/database/NO_TAPEALERT
For Windows:
\volmgr\database\NO_TAPEALERT
The StorageTek library transport control unit tracks how much tape passes through each transport and sends a message to ACSLS when a transport requires cleaning. If auto-cleaning is enabled, ACSLS automatically mounts a cleaning cartridge on the transport. If all the cleaning cartridges have expired (MAX_USAGE), ACSLS will post an error message 376N into the acsss_event log. If auto-cleaning is disabled, ACSLS logs a message in the event log and displays a message at the cmd_proc when cleaning is required.
This option is enabled or disabled using the acsss_config configuration utility. This utility will allow you to specify how the cartridges are ordered for selection and queries.
NOTE: You cannot use the acsss_config configuration program to enable auto-cleaning for drives attached to a SCSI connected library storage module (LSM).
For more information regarding ACSLS tape cleaning, please contact StorageTek.
3.  DEVICE CONFIGURATION FOR ACSLS CONTROLLED TAPE DRIVES      
This section covers device configuration.
   
3.1  DEVICE CONFIGURATION
NOTE: All Automated Cartridge System (ACS) robots configured on a media server must be configured with at least one drive, or the acsd daemon will exit, putting all Automated Cartridge System Library Software (ACSLS) drives in Automatic Volume Recognition (AVR) mode.
3.1.1  SSO Device Configuration for ACSLS
During setup (in the Device Configuration Wizard), NetBackup will attempt to discover the tape drives available to it and, for robot types where serialization is available, their positions within the library.
NetBackup does not yet obtain drive serial numbers from the ACS robotic library control interface, so manual configuration is required. The manual configuration cannot be avoided in a non-Shared Storage Option (non-SSO) environment, where drives are not being shared. Using NetBackup 4.5 FP6, the user can significantly reduce the amount of manual configuration required by following these steps in an SSO environment.
1. Run the device configuration wizard on just one of the hosts where drives in an ACS-controlled library are attached. Let the drives be added as standalone drives.
2. Add the ACS robot definition, and update each drive to indicate its appropriate position in the ACS robot. (Make the drive robotic, and add the ACS, LSM, Panel, and Drive information.) See the VERITAS Media Manager System Administration Guide, Configuring Storage Devices chapter, in the section "Co-relating Device Files to Physical Drives When Adding Drives."
3. Verify the drive paths, if this hasn't already been done in the previous step, based on the documentation referenced above.
4. Once the drive paths have been verified on one host, re-run the device configuration wizard, and specify all hosts with ACS drives in the library to be scanned. The device configuration wizard will add the ACS robot definition and the drives to the remaining servers automatically, with correct device paths, assuming that the devices were successfully discovered, along with their serial numbers.
By following the above steps, the time savings can be significant. For example, if there are 20 drives shared on 30 hosts, the above configuration steps require just 20 paths to be manually configured, instead of 600 paths.
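The arithmetic behind that example can be written out as follows. This is only an illustration of the counting; the function names are hypothetical.

```shell
#!/bin/sh
# Paths to configure by hand when every drive is configured on every host.
# Arguments: number of drives, number of hosts.
manual_paths_all_hosts() {
    echo $(( $1 * $2 ))
}

# Paths to configure by hand with the SSO procedure above: each drive is
# configured once, on the first host only. Argument: number of drives.
manual_paths_sso() {
    echo "$1"
}
```

With 20 drives and 30 hosts, manual_paths_all_hosts 20 30 prints 600, while manual_paths_sso 20 prints 20.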
3.1.2  Non-SSO Device Configuration for ACSLS
During setup (in the Device Configuration Wizard), NetBackup will attempt to discover the tape drives available to it and, for robot types where serialization is available, their positions within the library. In a non-SSO environment, do not use the Device Configuration Wizard for ACS drives: NetBackup does not obtain drive serial numbers from the ACS robotic library control interface, so manual configuration is required.
3.1.3  Initial configuration of StorageTek T9940A and T9940B tape drives in an ACSLS environment
It is advised to separate the two drive types within the NetBackup Media Management device configuration to alleviate density conflicts.
This issue surfaces because ACS treats the T9940A and T9940B drive media as identical. However, the T9940B writes at a higher density, so a T9940A drive cannot read a tape written by a T9940B drive. When both drive types are used within a single library, a different density must therefore be configured for each drive type. The same issue will occur with SDLT220 and SDLT320 drives in the same ACS-based library.
Workaround:
Add the ACS robot to the NetBackup device configuration according to the steps described within the NetBackup Media Manager Device Configuration Guide.
Configure the STK 9940A drives as type hcart and configure the STK 9940B drives as type hcart2. Then define a NetBackup storage unit for each density, hcart and hcart2.
Steps to inventory media for each density type:
1. In the /usr/openv/volmgr/vm.conf file on the server where vmupdate is run, add "IGNORE_WRONG_MEDIA_TYPE" to the end of the file
2. Run /usr/openv/volmgr/bin/vmupdate -rn  -rt acs -acs_stk2p hcart  -- All new media found will be configured as hcart
3. Next, add the STK 9940B media to the robot
4. Finally, run /usr/openv/volmgr/bin/vmupdate -rn  -rt acs -acs_stk2p hcart2  -- All new media found will be configured as hcart2.
When an inventory of ACS robotics is done, NetBackup will receive a vendor media type back as well as the barcode. That vendor media type is mapped to one of the NetBackup media types. The tag "IGNORE_WRONG_MEDIA_TYPE" allows NetBackup to map a single ACS vendor media type to multiple NetBackup media types.
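Step 1 above can be done by hand with an editor. If it is scripted, it helps to make the edit idempotent so that the tag is not added twice. The following is a minimal sketch; the function name add_vm_conf_entry is hypothetical, and it assumes the existing file ends with a newline.

```shell
#!/bin/sh
# Append LINE to FILE unless an identical line is already present.
# Arguments: path to the file, the exact line to add.
add_vm_conf_entry() {
    file=$1
    line=$2
    touch "$file" || return 1
    # -x matches whole lines only, -F treats LINE as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For example: `add_vm_conf_entry /usr/openv/volmgr/vm.conf IGNORE_WRONG_MEDIA_TYPE`.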
Disadvantages:
1. If for any reason, media is ejected from the library, verification is required when re-injecting media that it goes to the correct media type (hcart for 9940A, hcart2 for 9940B).
2. T9940B drives cannot be used to read the 9940A media; they have to be segregated.
4  ROBTEST FOR ACS LIBRARIES
This section covers acstest (robtest).
4.1  INVOKING ROBTEST
# /usr/openv/volmgr/bin/robtest
Configured robots with local control supporting test utilities:
ACS(0)     ACSLS host = taco
Robot Selection
---------------
1)  ACS 0
2)  none/quit
Enter choice: 1
Robot selected: ACS(0)   ACSLS host = taco, SSI socket = 13741
Invoking robotic test utility:
/usr/openv/volmgr/bin/acstest -r taco -s 13741 -d /dev/rmt/1cbn 0,3,1,0
Server 0 with 24 free cells is in state "STATE_RUN"
QUERY SERVER complete
Enter acs commands (? returns help information)
4.2  ROBTEST SYNTAX
?
To exit the utility, type q or Q.
cancel                   - Cancel server request
defpool      - Define scratch pool
delpool                        - Delete empty scratch pool
dm  [|] [f]    - Dismount volume (optionally forced)
drstat                     - Print drive status
eject              - Eject a list of volumes to the specified CAP
enter                        - Enter volumes in the specified CAP
capstat []                   - Print CAP status
varycap  online|offline      - set the state of the given CAP
setmode  automatic|manual    - set the mode of the given CAP
setpriority        - set the priority of the given CAP
m  [|]         - Mount volume
qmmi                                 - Query actual mixed media information
qpool []                       - Query pools
qreq []                  - Query server requests
qscr []                        - Query scratch volumes by pool
qserver                              - Query ACSLS server
qvol []                         - Query volumes
setscr  ON|OFF  []   - Set scratch attributes for volume range
start                                - Start ACS Library Manager (request RUN state)
types                                - Print list of known ACS media types
SCSI commands:
unload |            - Issue SCSI unload
where:
=0-126, =0-23, =0-2, =0-15, =0-127
= [,,,]
= d1 if drive 1, d2 if drive 2, ..., d15 if drive 15
= scratch pool low water mark
= scratch pool high water mark
= ,,
= [::...:]
4.2.1  Command to Obtain Drive Status
drstat
Drive 1 information:
ID (acs,lsm,panel,drv): 0,0,0,0
drive type:             DLT7000
volume ID:              
state:                  STATE_ONLINE
status:                 STATUS_DRIVE_AVAILABLE
Drive 2 information:
ID (acs,lsm,panel,drv): 0,0,0,1
drive type:             DLT7000
volume ID:              
state:                  STATE_ONLINE
status:                 STATUS_DRIVE_AVAILABLE
Drive 3 information:
ID (acs,lsm,panel,drv): 0,0,0,2
drive type:             9840
volume ID:              
state:                  STATE_ONLINE
status:                 STATUS_DRIVE_AVAILABLE
Drive 4 information:
ID (acs,lsm,panel,drv): 0,0,0,3
drive type:             9840
volume ID:              
state:                  STATE_ONLINE
status:                 STATUS_DRIVE_AVAILABLE
DRIVE STATUS complete
4.2.2  Command to Query Volumes
qvol
000002   STK1R      home       0, 0, 1, 0, 0           
000003   STK1R      home       0, 0, 0, 2, 0           
000004   STK1R      home       0, 0, 0, 3, 0           
000005   STK1R      home       0, 0, 1, 5, 0           
000006   STK1R      home       0, 0, 1, 8, 1           
000008   STK1R      home       0, 0, 0, 23, 0         
QUERY VOLUME complete
4.2.3  Command to Mount a Volume
m 000040 0,0,0,2
MOUNT complete
4.2.4  Command to Dismount a Volume
dm 000040 0,0,0,2 f
DISMOUNT complete     
4.3  HOW TO DEFINE ACSLS SCRATCH POOLS AND ADD VOLUMES USING ROBTEST
Start the robtest utility:
On UNIX:
# /usr/openv/volmgr/bin/robtest
On Windows:
veritas\volmgr\bin\robtest.exe
Select the ACS robot. Enter the 'define pool' command as follows:
defpool 4 0 500 1
Scratch pool 4 has been defined
NOTE: DEFINE POOL completes robot inventories for ACS,
where 4 is the pool number
where 0 500 is the low and high water marks
where 1 is overflow on (could be 0 for overflow off)
Next, define ACSLS scratch volumes in this pool:
qpool 4
Pool ID 4 has 0 volumes
QUERY POOL complete
setscr 4 ON 000040 000044
000040   STATUS_SUCCESS
000041   STATUS_SUCCESS
000042   STATUS_SUCCESS
000043   STATUS_SUCCESS
000044   STATUS_SUCCESS
SET SCRATCH complete
qpool 4
Pool ID 4 has 5 volumes
QUERY POOL complete
Quit 'robtest' and perform a normal robot inventory with NetBackup.
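In the setscr example above, the two labels are the first and last volumes of an inclusive range, which is why five volumes are reported. The following sketch lists the labels such a range covers, so the count can be checked before running setscr. It assumes purely numeric six-digit labels, and the function name volser_range is hypothetical.

```shell
#!/bin/sh
# List the labels in an inclusive numeric volume range, zero-padded to six
# digits. ASSUMPTION: labels are purely numeric, as in 000040 to 000044.
volser_range() {
    awk -v first="$1" -v last="$2" 'BEGIN {
        for (i = first + 0; i <= last + 0; i++) printf "%06d\n", i
    }'
}
```

For example, `volser_range 000040 000044 | wc -l` counts the volumes in the range used above.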
5  MEDIA
This section covers "available_media" script output and tape ejection.
5.1  AVAILABLE MEDIA SCRIPT
The utility /usr/openv/netbackup/bin/goodies/available_media does not report robot slot numbers for ACS libraries. This is not a bug. The information on slot location is managed by ACS and not by Media Manager.
Below is a sample output from the available_media report, which can be generated by running the following command:
Windows:  \netbackup\bin\goodies\available_media
or
UNIX:  /usr/openv/netbackup/bin/goodies/available_media
media    media  robot  robot  robot  side/  ret    size    status
ID       type   type   #      slot   face   level  Kbytes
-----------------------------------------------------------------------------------------------
NetBackup pool
NB0001  DLT      ACS      0      -      -      0    2848    ACTIVE
NB0002  DLT      ACS      0      -      -      0    2848    ACTIVE
ABC234  DLT      ACS      0      -      -      -    -       AVAILABLE
ABC345  DLT      ACS      0      -      -      -    -       AVAILABLE
5.2  HOW TO SPECIFY WHICH MEDIA ACCESS PORT (MAP) TO USE FOR TAPE EJECTION
In the media server's /usr/openv/volmgr/vm.conf file, it is possible to specify the media access port (MAP) to use when ejecting media to a particular ACS robot. If this entry is present, NetBackup (including the Vault extension) will eject to the specified MAP instead of the default 0,0,0 MAP.
The vm.conf entry syntax:
MAP_ID = robot-num map-id
Example: If a user wants the ACS(0) robot to eject via its 0,0,1 MAP and the ACS(1) robot to eject via its 0,1,0 MAP, the following vm.conf entries would be necessary on the media servers that use these robots:
MAP_ID = 0 0,0,1
MAP_ID = 1 0,1,0
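To check which MAP a given robot would eject to, the entries can be read back from vm.conf. The following is a minimal sketch that mimics the rule described above (use the MAP_ID entry if present, otherwise the default 0,0,0). It is not NetBackup's own parser; the function name map_for_robot is hypothetical, and it assumes the entries are written with spaces around "=" as in the syntax shown.

```shell
#!/bin/sh
# Print the MAP id configured for robot number ROBOT in vm.conf-style text
# read from stdin, or the default 0,0,0 when no MAP_ID entry matches.
map_for_robot() {
    awk -v robot="$1" '
        $1 == "MAP_ID" && $2 == "=" && $3 == robot { map = $4 }
        END { print (map == "" ? "0,0,0" : map) }
    '
}
```

For example: `map_for_robot 1 < /usr/openv/volmgr/vm.conf`.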
6.  COMMUNICATION
This section covers RPC and communication.

6.1  REMOTE PROCEDURE CALL (RPC)
NetBackup uses RPC to connect to the ACSLS server, and rpcbind is the service that converts RPC program numbers into universal addresses. It must be running on the host to be able to make RPC calls on a server on that machine.
6.1.1  How to start RPC on different operating systems
Starting RPC is best accomplished using the operating system vendor startup scripts. The following are the ways of starting RPC on various operating systems.
a. Solaris
    #  /etc/init.d/rpc start
b. HP-UX
    #  /sbin/init.d/Rpcd start
c. AIX
    #  startsrc -s portmap
d. Linux
    #  /etc/rc.d/init.d/portmap start
e. Tru64
   #  /usr/sbin/portmap
f. Windows
   Click Start | Settings | Control Panel | Administrative Tools | Services. Select Remote Procedure Call (RPC) and click Start.
6.1.2  How to Verify that RPC is Running
The following commands will verify that the rpcbind is active and that the RPC service is functioning between the media server and ACSLS Library Server. From a terminal window on the media server, issue the following command to ACSLS.
# rpcinfo
program version netid     address             service    owner
100000    4    ticots    hotdog.rpc          rpcbind    superuser
100000    3    ticots    hotdog.rpc          rpcbind    superuser
100000    4    ticotsord hotdog.rpc          rpcbind    superuser
100000    3    ticotsord hotdog.rpc          rpcbind    superuser
100000    4    ticlts    hotdog.rpc          rpcbind    superuser
100000    3    ticlts    hotdog.rpc          rpcbind    superuser
100000    4    tcp       0.0.0.0.0.111       rpcbind    superuser
100000    3    tcp       0.0.0.0.0.111       rpcbind    superuser
100000    2    tcp       0.0.0.0.0.111       rpcbind    superuser
100000    4    udp       0.0.0.0.0.111       rpcbind    superuser
100000    3    udp       0.0.0.0.0.111       rpcbind    superuser
100000    2    udp       0.0.0.0.0.111       rpcbind    superuser
If the service is not running, rpcinfo will report: "can't contact rpcbind: RPC: rpcbind failure - RPC: Failed ( unspecified error )"
Examine the /export/home/ACSSS/log/acsss_event.log for "RPC: Rpcbind failure." The error message should include an IP that it is trying to communicate with.  Verify it is the correct IP address.
6.1.3  How to Verify the ACSSS Program Registration
#rpcinfo -t {acsls_hostname} 300031 2
program 300031 version 2 ready and waiting
#rpcinfo -t {acsls_hostname} 300031 1
program 300031 version 1 ready and waiting
You should get a response from both programs, but you only need a response from the version that you are using
(2 = UDP or 3 = TCP). The NetBackup default for this communications service is UDP.
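The two rpcinfo checks above can be wrapped in a loop. This is a minimal sketch: the function name check_acsls_rpc and the RPCINFO override variable are hypothetical conveniences, and only the rpcinfo -t call itself comes from the commands shown above.

```shell
#!/bin/sh
# Command used to probe the server; may be overridden (for example with a
# stub) for dry runs.
RPCINFO=${RPCINFO:-rpcinfo}

# Check that ACSLS program 300031 answers for each listed version.
# Arguments: ACSLS hostname, then one or more program versions.
# Returns non-zero if any version fails to respond.
check_acsls_rpc() {
    host=$1
    shift
    rc=0
    for vers in "$@"; do
        if "$RPCINFO" -t "$host" 300031 "$vers" >/dev/null 2>&1; then
            echo "version $vers: ready"
        else
            echo "version $vers: no response"
            rc=1
        fi
    done
    return $rc
}
```

For example, `check_acsls_rpc {acsls_hostname} 2 1` runs both checks from this section, with the real ACSLS hostname substituted.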
6.1.4  Basic snoop Output
Basic snoop of a query server sent automatically by initiating the robtest utility:
# snoop salad carter
1   0.00000        salad -> carter.min.veritas.com PORTMAP C GETPORT prog=300031 (?) vers=1 proto=UDP
2   0.00108 carter.min.veritas.com -> salad        PORTMAP R GETPORT port=1025
3   0.00056        salad -> carter.min.veritas.com RPC C XID=1066379121 PROG=300031 (?) VERS=1 PROC=1000
4   0.00091 carter.min.veritas.com -> salad        RPC R (#3) XID=1066379121 Success
5   0.00459 carter.min.veritas.com -> salad        RPC C XID=1066114434 PROG=1073741824 (transient) VERS=1 PROC=1000
6   0.00031        salad -> carter.min.veritas.com RPC R (#5) XID=1066114434 Success
7   0.15656 carter.min.veritas.com -> salad        RPC C XID=1065841469 PROG=1073741824 (transient) VERS=1 PROC=1000
8   0.00029        salad -> carter.min.veritas.com RPC R (#7) XID=1065841469 Success
This trace shows that the portmapper lookup and program registration were successful.
For additional assistance with NetBackup options as they pertain to ACSLS, reference the Media Manager System Administrator's Guide.
7  COMMON ACS ERROR MESSAGES
This section covers common ACSLS error messages.
7.1  ACS(2) UNAVAILABLE: INITIALIZATION FAILED: UNABLE TO INITIALIZE ROBOT
Resolution:  Verify the IP address specified for the robotic host is correct in LibAttach on the master server.

7.2  ACS STATUS = 54, STATUS_IPC_FAILURE
Robtest will show:

acs_query_server() failed
Unable to query server taco, ACS status = 54, STATUS_IPC_FAILURE
Robotic test utility /usr/openv/volmgr/bin/acstest
returned abnormal exit status (1).
STATUS_PENDING

Cause:  local network interface is down.  

7.3  ACS STATUS = 72, STATUS_PENDING
Example 1:
robtest will show:
acs_response() failed
Unable to obtain Query Server acknowledge response, ACS status = 72, STATUS_PENDING
Robotic test utility /usr/openv/volmgr/bin/acstest
returned abnormal exit status (1).
Media Server event.log will log:
02-05-04 13:49:18 SSI[0]:
ONC RPC: csi_rpccall(): status:STATUS_NI_FAILURE; failed: clntudp_create()
RPC UDP client connection failed, RPC: Rpcbind failure
Remote Internet address:10.82.56.67, Port: 0
Cause:  ACSLS server network interface is down
Example 2:
Media server system log :
Nov 15 11:18:06 hoehpt07 acsd[8807]: ACS(0) Response has not been returned by Mount command sequence 4434, ACS status = 72, STATUS_PENDING
Nov 15 11:28:06 hoehpt07 acsd[8807]: ACS(0) Response has not been returned by Mount command sequence 4434, ACS status = 72, STATUS_PENDING
Event log from the ACSLS :
2003-05-22 11:33:51 CSI[0]:
1022 N csi_net_send.c 1 474
ONC RPC: csi_net_send(): status:STATUS_NI_TIMEDOUT; failed: st_net_send() Cannot send message to NI:discarded, Network timeout Errno = 0 (none) Remote Internet address: 207.169.154.59, Port: 53429
2003-05-22 11:33:51 CSI[0]:
1026 N csi_freeqmem.c 1 142
ONC RPC: csi_freeqmem(): status:STATUS_QUEUE_FAILURE;
Dropping from Queue: Remote Internet address: 207.169.154.59,Port: 53429 , ssi_identifier: 1, Protocol: 2, Connect type: 1
2003-05-22 11:33:51 ACSSA[0]:
1432 N sa_demux.c 1 273
Server System network interface timeout.
Cause:
The errors above indicate that NetBackup is able to reach the ACSLS server with a status request, but the ACSLS server is unable to respond.

By default, the IP address sent to the ACSLS as part of the packet STATUS request is that of the primary hostname of the media server, i.e. the hostname given by uname -a. This error occurs when the ACSLS cannot perform a reverse name lookup for that address or is configured (via routing) to reach only a secondary interface on the media server.

If the issue is a failed reverse name lookup, add the media server's IP address to the domain name service (DNS) reverse tables or to the ACSLS /etc/hosts file.
If the ACSLS server cannot route to the media server's hostname, override the default behavior by using "ACS_SSI_HOSTNAME = [hostname]" in the /usr/openv/volmgr/vm.conf file, where [hostname] is a hostname associated with an IP address that the ACSLS can reach on the media server.

7.4  STATUS_NI_FAILURE
Explanation of message: STATUS_NI_TIMEDOUT:
The CSI (media server) has timed out waiting for a response from a client (ACSLS). The actual "Waiting to obtain XXXXX ACS sequence YYY acknowledge response" indicates that the daemon has not received an acknowledgment from the LibraryStation module for 30 seconds (which is a hard-coded limit) following a sent command. "Unable to obtain" means it has given up.
The timeouts seen occur after a 30 second delay, when no acknowledgment has been received. These timeout periods are determined by two tunable environment variables:

CSI_RETRY_TIMEOUT - The default for which is 3 seconds, not 2 as described in the Media Management manual.
CSI_RETRY_TRIES - The default for this is 5 retries.

Changes: Add the following to /usr/openv/volmgr/vm.conf on the media server.
                           CSI_RETRY_TIMEOUT=30  
                           CSI_RETRY_TRIES=10
or

Change the OS environment variable CSI_RETRY_TIMEOUT to 30 and CSI_RETRY_TRIES to 10.
From bash/ksh prompt:
# CSI_RETRY_TIMEOUT=30; export CSI_RETRY_TIMEOUT
# CSI_RETRY_TRIES=10; export CSI_RETRY_TRIES

Add to the NetBackup startup script for a permanent solution.

Isolating the problem:
a. Ensure that the media server can successfully ping the ACS server and vice-versa
b. Check that ltid has started the acsd, acsssi, and acssel daemons on the VERITAS media server by running bpps -a from /usr/openv/netbackup/bin or vmps from /usr/openv/volmgr/bin. If acsssi is not running, ensure that RPC is running.
c. Run snoop between the media server and ACS. Initiate robtest and check that the query server request, sent upon robtest initialization, is answered by the ACSLS server (reference 'Basic snoop Output' above).
d. Verify the RPC communications between the media server and ACSLS host using the rpcinfo command. The rpcinfo -t {acsls_hostname} 300031 1 command checks RPC connectivity, portmapper registration, and that the ACSLS program (service) is available (reference 'How to Verify the ACSSS Program Registration' above).
e. Check the event logs from the media server and library server for errors (reference Logging above)
f.  Check the syslog or event file for errors
g. Enable tracing (reference Logging above)
h.  This error can occur if the users.ALL.allow file on the ACSLS Library Server does not contain an entry for the requesting media server. This file is found on the ACSLS server in the $ACS_HOME/data/external/access_control directory and is used for granting or denying library access. For example entries for the users.ALL.allow file, consult the ACSLS administrator's guide.

7.4.1  ACS status=104, STATUS_NI_FAILURE
robtest will show:
acs_response() failed
Unable to obtain Query Server acknowledge response, ACS status = 104, STATUS_NI_FAILURE
Robotic test utility /usr/openv/volmgr/bin/acstest
returned abnormal exit status (1).
acsssi event.log:
02-05-04 13:31:54 SSI[0]:
ONC RPC: csi_rpccall(): status:STATUS_NI_FAILURE; failed: clntudp_create()
RPC UDP client connection failed, RPC: Program not registered
Remote Internet address:10.82.56.67, Port: 0;
Cause: ACSLS is down or not responding
7.4.2  ACS status = 105, STATUS_NI_TIMEDOUT
Example of the message log:
Aug 26 17:06:57 sdhra1a acsd[15914]: ACS(0) Unable to obtain Query ACS sequence 3420 acknowledge response, ACS status = 105, STATUS_NI_TIMEDOUT
Aug 26 17:06:57 sdhra1a acsd[5476]: DecodeDismount(): ACS(0) driveid 0,0,10,5, Actual status: Unable to initialize robot
Aug 26 17:06:57 sdhra1a acsd[5476]: ACS(0) going to DOWN state, status: Unable to initialize robot
Aug 26 17:07:15 sdhra1a acsd[15910]: ACS(0) Waiting to obtain Query Drive sequence 3381 acknowledge response, ACS status = 72, STATUS_PENDING
Cause:  The CSI (media server) has timed out waiting for a response from a client (ACSLS).



This article comes from a ChinaUnix blog. To view the original, see: http://blog.chinaunix.net/u/123/showart_209383.html