Chinaunix

Original poster: chiarier

[Veritas NBU] NBU backup fails. What is the problem?

#11 | Posted 2010-10-09 00:55
/usr/openv/netbackup/bin/admincmd/bpmedia -freeze -m media_id

Freeze the specified media first, then see whether the backup succeeds with a different tape.

#12 | Posted 2010-10-09 09:04
The following log is from bptm:
11:05:13.497 [25487] <2> bptm: INITIATING (VERBOSE = 0): -w -c newdb2 -den 20 -rt 8 -rn 0 -stunit newdb2-hcart3-robot-tld-0 -cl Newdb2_Ora_arch -bt 1286507103 -b newdb2_1286507103 -st 2 -cj 1 -p OraDataPool -reqid -1286207250 -jm -brm -hostname newdb2 -L /usr/openv/netbackup/logs/user_ops/dbext/logs/24981.0.1286507102 -ru oracle -rclnt newdb2 -rclnthostname newdb2 -rl 3 -rp 2678400 -sl Default-Application-Backup -ct 4 -maxfrag 1048576 -mediasvr newdb2 -no_callback -connect_options 0x01010100 -jobid 15752 -jobgrpid 15752 -masterversion 600000 -shm
11:05:13.507 [25487] <2> bptm: EMMserver_name = newdb2
11:05:13.507 [25487] <2> bptm: EMMserver_port = 1556
11:05:13.507 [25487] <2> main: Setting mud from bp.conf
11:05:13.515 [25487] <4> db_getSTUNIT: emmserver_name = newdb2
11:05:13.515 [25487] <4> db_getSTUNIT: emmserver_port = 1556
11:05:13.527 [25487] <2> VssGetFQDNHostName: vss_auth.cpp.3997: Function: VssGetFQDNHostName. Search name
11:05:13.527 [25487] <2> VssInit: vss_auth.cpp.716: Function: VssInit. Using Cached entries FALSE
11:05:13.528 [25487] <2> Human Message: Optional VxSS libraries not initialized.
11:05:13.528 [25487] <2> VssCleanUp: vss_auth.cpp.854: Function: VssCleanUp result: 21
11:05:13.594 [25487] <2> io_init: using 65536 data buffer size
11:05:13.594 [25487] <2> io_init: CINDEX 0, sched Kbytes for monitoring = 40000
11:05:13.594 [25487] <2> io_set_recvbuf: setting receive network buffer to 32032 bytes
11:05:13.594 [25487] <2> io_init: using 8 data buffers
11:05:13.594 [25487] <2> io_init: child delay = 20, parent delay = 30 (milliseconds)
11:05:13.595 [25487] <2> create_shared_memory: shm_size = 524484, buffer address = 0xfd000000, buf control = 0xfd080000, ready ptr = 0xfd0800c0
11:05:13.605 [25487] <2> setup_bpbkar_info: /usr/openv/netbackup/db/config/shm/newdb2_1286507103 file successfully created
11:05:13.611 [25487] <2> LOCAL CLASS_ATT_DEFS: Product ID = 7
11:05:13.642 [25487] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2034: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
11:05:13.642 [25487] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2048: service: bpdbm
11:05:13.689 [25487] <2> logconnections: BPDBM CONNECT FROM 172.16.17.76.59006 TO 172.16.17.76.13724
11:05:13.849 [25487] <2> nbjm_media_request: Passing job control to NBJM, type WRITE
11:05:13.850 [25487] <2> VssGetFQDNHostName: vss_auth.cpp.3997: Function: VssGetFQDNHostName. Search name
11:05:13.850 [25487] <2> VssInit: vss_auth.cpp.716: Function: VssInit. Using Cached entries FALSE
11:05:13.850 [25487] <2> Human Message: Optional VxSS libraries not initialized.
11:05:13.850 [25487] <2> VssCleanUp: vss_auth.cpp.854: Function: VssCleanUp result: 21
11:05:15.959 [25487] <2> RequestMultipleResources: returning
11:05:15.960 [25487] <2> parse_resource_strings: MEDIADB 1 3352 H898L3 4000016 ------ 20 1283716456 1285527204 1288205604 0 827133376 248 234 3 4 0 256 1024 0 0 0
11:05:15.960 [25487] <2> parse_resource_strings: Parsed message type 15, version 1, 21 parameters
11:05:15.960 [25487] <2> parse_resource_strings: VOLUME 1 H898L3 4000016 MAH898L3 OraDataPool *NULL* *NULL* 24 8 0 8 0 {00000000-0000-0000-0000-000000000000} 0
11:05:15.960 [25487] <2> parse_resource_strings: Parsed message type 16, version 1, 14 parameters
11:05:15.960 [25487] <2> parse_resource_strings: DRIVE 2 HP.ULTRIUM3-SCSI.000 2000014 HU10548Y55 /dev/rmt/0cbn -1 -1 -1 -1 0 0 0 0 *NULL* *NULL* *NULL* *NULL* 1 0
11:05:15.960 [25487] <2> parse_resource_strings: Parsed message type 17, version 2, 19 parameters
11:05:15.960 [25487] <2> parse_resource_strings: STORAGE 0 newdb2-hcart3-robot-tld-0 20 1048576
11:05:15.960 [25487] <2> parse_resource_strings: Parsed message type 18, version 0, 4 parameters
11:05:15.960 [25487] <2> nbjm_media_request: Job control returned to BPTM
11:05:15.960 [25487] <2> drivename_open: Called with Create 1, file HP.ULTRIUM3-SCSI.000
11:05:15.960 [25487] <2> drivename_lock: lock established
11:05:15.960 [25487] <2> drivename_write: Called with mode 0
11:05:15.982 [25487] <2> mount_open_media: Waiting for mount of media id H898L3 (copy 1) on server newdb2.
11:05:15.989 [25487] <4> create_tpreq_file: symlink to path /dev/rmt/0cbn
11:05:15.998 [25487] <2> manage_scsi_reserve: SCSI RESERVE
11:05:16.002 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:05:16.002 [25487] <2> send_MDS_msg: OP_STATUS 0 3352 newdb2 8196 5 0 0 0 0 0 0 *NULL* 0
11:05:16.003 [25487] <2> JobInst::sendIrmMsg: returning
11:05:16.003 [25487] <2> send_operation_error: Decoded status = 4 from 5
11:05:16.049 [25487] <2> log_media_error: successfully wrote to error file - 10/08/10 11:05:16 H898L3 -1 OPEN_ERROR HP.ULTRIUM3-SCSI.000
11:05:16.049 [25487] <4> expandpath: /usr/openv/netbackup/db/media/tpreq/drive_HP.ULTRIUM3-SCSI.000
11:09:16.272 [2513] <2> bptm: INITIATING (VERBOSE = 0): -rptdrv -jobid -1286207251 -jm
11:09:16.280 [2513] <2> bptm: EMMserver_name = newdb2
11:09:16.280 [2513] <2> bptm: EMMserver_port = 1556
11:09:16.281 [2513] <2> drivename_open: Called with Create 0, file HP.ULTRIUM3-SCSI.000
11:09:16.281 [2513] <2> drivename_checklock: Called
11:09:16.281 [2513] <2> drivename_checklock: PID 25487 has lock
11:09:16.281 [2513] <2> report_drives: MODE = 0
11:09:16.281 [2513] <2> report_drives: TIME = 1286507115
11:09:16.281 [2513] <2> report_drives: MASTER = newdb2
11:09:16.281 [2513] <2> report_drives: PATH = /dev/rmt/0cbn
11:09:16.281 [2513] <2> report_drives: MEDIA = H898L3
11:09:16.281 [2513] <2> report_drives: REQID = -1286207250
11:09:16.281 [2513] <2> report_drives: ALOCID = 3352
11:09:16.281 [2513] <2> report_drives: RBID = {D8C5B6AA-1DD1-11B2-90AC-00144F266641}
11:09:16.281 [2513] <2> report_drives: PID = 25487
11:09:16.281 [2513] <2> report_drives: FILE = /usr/openv/netbackup/db/media/tpreq/drive_HP.ULTRIUM3-SCSI.000
11:09:16.282 [2513] <2> main: Sending [EXIT STATUS 0] to NBJM
11:09:16.282 [2513] <2> bptm: EXITING with status 0 <----------
11:09:22.094 [2700] <2> bptm: INITIATING (VERBOSE = 0): -delete_expired
11:09:22.102 [2700] <2> bptm: EXITING with status 0 <----------
11:09:22.562 [2738] <2> bptm: INITIATING (VERBOSE = 0): -delete_all_expired
11:09:22.571 [2738] <2> bptm: EMMserver_name = newdb2
11:09:22.571 [2738] <2> bptm: EMMserver_port = 1556
11:09:22.584 [2738] <2> VssGetFQDNHostName: vss_auth.cpp.3997: Function: VssGetFQDNHostName. Search name
11:09:22.584 [2738] <2> VssInit: vss_auth.cpp.716: Function: VssInit. Using Cached entries FALSE
11:09:22.585 [2738] <2> Human Message: Optional VxSS libraries not initialized.
11:09:22.585 [2738] <2> VssCleanUp: vss_auth.cpp.854: Function: VssCleanUp result: 21
11:09:22.678 [2738] <2> bptm: EXITING with status 0 <----------
11:12:11.100 [25487] <2> send_MDS_msg: OP_STATUS 0 3352 newdb2 9 1 0 0 0 0 0 0 *NULL* 0
11:12:11.102 [25487] <2> JobInst::sendIrmMsg: returning
11:12:11.102 [25487] <2> send_operation_error: Decoded status = 9 from 1
11:12:11.127 [25487] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2034: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
11:12:11.127 [25487] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2048: service: bpjobd
11:12:11.171 [25487] <2> logconnections: BPJOBD CONNECT FROM 172.16.17.76.59061 TO 172.16.17.76.13724
11:12:11.172 [25487] <2> job_authenticate_connection: ignoring VxSS authentication check for now...
11:12:11.172 [25487] <2> job_connect: Connected to the host newdb2 contype 10 jobid <15752> socket <13>
11:12:11.172 [25487] <2> job_connect: Connected on port 59061
11:12:11.220 [25487] <2> job_monitoring_exex: ACK disconnect
11:12:11.220 [25487] <2> job_disconnect: Disconnected
11:12:11.221 [25487] <2> db_error_add_to_file: dberrorq.c:midnite = 1286467200
11:12:11.249 [25487] <16> mount_open_media: error requesting media, TpErrno = Robot operation failed
11:12:11.249 [25487] <4> create_tpreq_file: symlink to path /dev/rmt/0cbn
11:12:11.256 [25487] <2> manage_scsi_reserve: SCSI RELEASE
11:12:11.260 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:12:16.253 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:12:21.253 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:12:26.253 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:12:31.252 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:12:36.252 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:12:41.252 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:12:46.252 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:12:51.251 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:12:56.251 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:13:01.251 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:13:06.251 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:13:11.251 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:13:16.250 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:13:21.250 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:13:26.250 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:13:31.247 [25487] <2> drivename_write: Called with mode 1
11:13:31.247 [25487] <2> drivename_unlock: unlocked
11:13:31.247 [25487] <2> drivename_close: Called
11:13:31.284 [25487] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2034: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
11:13:31.284 [25487] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2048: service: bpjobd
11:13:31.328 [25487] <2> logconnections: BPJOBD CONNECT FROM 172.16.17.76.59066 TO 172.16.17.76.13724
11:13:31.328 [25487] <2> job_authenticate_connection: ignoring VxSS authentication check for now...
11:13:31.328 [25487] <2> job_connect: Connected to the host newdb2 contype 10 jobid <15752> socket <13>
11:13:31.328 [25487] <2> job_connect: Connected on port 59066
11:13:31.377 [25487] <2> job_monitoring_exex: ACK disconnect
11:13:31.377 [25487] <2> job_disconnect: Disconnected
11:13:31.377 [25487] <2> db_error_add_to_file: dberrorq.c:midnite = 1286467200
11:13:31.406 [25487] <8> write_backup: media id H898L3 load operation reported an error
11:13:31.406 [25487] <2> nbjm_media_request: Passing job control to NBJM, type FAIL
11:13:32.469 [9885] <2> bptm: INITIATING (VERBOSE = 0): -unload -dn HP.ULTRIUM3-SCSI.000 -dp /dev/rmt/0cbn -dk 2000014 -m H898L3 -mk 4000016 -mds 8 -alocid 3352 -jobid -1286207252 -jm
11:13:32.478 [9885] <2> bptm: EMMserver_name = newdb2
11:13:32.478 [9885] <2> bptm: EMMserver_port = 1556
11:13:32.479 [9885] <2> send_brm_msg: PID of bpxm = 9885
11:13:32.479 [9885] <2> nbjm_media_request: Passing job control to NBJM, type UNLOAD
11:13:32.490 [9885] <2> VssGetFQDNHostName: vss_auth.cpp.3997: Function: VssGetFQDNHostName. Search name
11:13:32.490 [9885] <2> VssInit: vss_auth.cpp.716: Function: VssInit. Using Cached entries FALSE
11:13:32.491 [9885] <2> Human Message: Optional VxSS libraries not initialized.
11:13:32.491 [9885] <2> VssCleanUp: vss_auth.cpp.854: Function: VssCleanUp result: 21
11:13:32.496 [9885] <2> taolog: can't register signal handler
11:13:34.638 [9885] <2> RequestMultipleResources: returning
11:13:34.638 [9885] <2> parse_resource_strings: MEDIADB 1 3352 H898L3 4000016 ------ 20 1283716456 1285527204 1288205604 0 827133376 248 234 3 4 0 256 1024 0 0 0
11:13:34.638 [9885] <2> parse_resource_strings: Parsed message type 15, version 1, 21 parameters
11:13:34.638 [9885] <2> parse_resource_strings: VOLUME 1 H898L3 4000016 MAH898L3 OraDataPool *NULL* *NULL* 24 8 0 8 0 {00000000-0000-0000-0000-000000000000} 0
11:13:34.638 [9885] <2> parse_resource_strings: Parsed message type 16, version 1, 14 parameters
11:13:34.638 [9885] <2> parse_resource_strings: DRIVE 2 HP.ULTRIUM3-SCSI.000 2000014 HU10548Y55 /dev/rmt/0cbn -1 -1 -1 -1 0 0 0 0 *NULL* *NULL* *NULL* *NULL* 1 0
11:13:34.638 [9885] <2> parse_resource_strings: Parsed message type 17, version 2, 19 parameters
11:13:34.638 [9885] <2> parse_resource_strings: STORAGE 0 newdb2-hcart3-robot-tld-0 20 1048576
11:13:34.638 [9885] <2> parse_resource_strings: Parsed message type 18, version 0, 4 parameters
11:13:34.638 [9885] <2> nbjm_media_request: Job control returned to BPTM
11:13:34.638 [9885] <2> drivename_open: Called with Create 1, file HP.ULTRIUM3-SCSI.000
11:13:34.638 [9885] <2> drivename_lock: lock established
11:13:34.639 [9885] <4> create_tpreq_file: symlink to path /dev/rmt/0cbn
11:13:34.648 [9885] <2> drivename_write: Called with mode 2
11:13:34.653 [9885] <2> process_tapealert: TapeAlert returned 0x00000000 0x00000000 (from tapealert_and_release)
11:13:34.653 [9885] <2> really_tpunmount: tpunmount'ing /usr/openv/netbackup/db/media/tpreq/drive_HP.ULTRIUM3-SCSI.000
11:13:36.146 [9885] <4> create_tpreq_file: symlink to path /dev/rmt/0cbn
11:13:36.156 [9885] <2> process_tapealert: TapeAlert returned 0x00000000 0x00000000 (from tapealert_and_release)
11:13:36.158 [9885] <2> tapealert_and_release: SCSI RELEASE
11:13:36.161 [9885] <4> tapealert_and_release: failed opening device with O_NDELAY
11:13:36.161 [9885] <2> drivename_unlock: unlocked
11:13:36.161 [9885] <2> drivename_close: Called
11:13:36.161 [9885] <2> drivename_remove: Called
11:13:36.161 [9885] <2> main: Sending [EXIT STATUS 0] to NBJM
11:13:36.162 [9885] <2> bptm: EXITING with status 0 <----------
11:13:36.697 [25487] <2> requestFailed: got gotCallback, failure status = [800]
11:13:37.437 [25487] <16> RequestMultipleResources: MultiResReq.cpp:1435 resource request failed [800]
11:13:37.437 [25487] <2> RequestMultipleResources: returning
11:13:37.437 [25487] <4> nbjm_media_request: Error from RequestMultipleResources, Master newdb2, returned error 800
11:13:37.461 [25487] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2034: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
11:13:37.461 [25487] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2048: service: bpjobd
11:13:37.508 [25487] <2> logconnections: BPJOBD CONNECT FROM 172.16.17.76.59084 TO 172.16.17.76.13724
11:13:37.508 [25487] <2> job_authenticate_connection: ignoring VxSS authentication check for now...
11:13:37.508 [25487] <2> job_connect: Connected to the host newdb2 contype 10 jobid <15752> socket <13>
11:13:37.508 [25487] <2> job_connect: Connected on port 59084
11:13:37.556 [25487] <2> job_monitoring_exex: ACK disconnect
11:13:37.557 [25487] <2> job_disconnect: Disconnected
11:13:37.557 [25487] <2> db_error_add_to_file: dberrorq.c:midnite = 1286467200
11:13:37.568 [25487] <16> nbjm_media_request: NBJM returned an extended error status: resource request failed (800)
11:13:37.568 [25487] <2> send_MDS_msg: OP_STATUS 0 3352 newdb2 8211 5 0 0 0 0 0 0 *NULL* 0
11:13:37.569 [25487] <2> JobInst::sendIrmMsg: returning
11:13:37.569 [25487] <2> send_operation_error: Decoded status = 19 from 5
11:13:37.573 [25487] <2> bptm: Calling tpunmount for media H898L3
11:13:37.573 [25487] <2> send_MDS_msg: MEDIA_DONE 0 -1286207250 0 H898L3 4000016 180
11:13:37.579 [25487] <2> JobInst::sendIrmMsg: returning
11:13:37.586 [25487] <2> bptm: EXITING with status 800 <----------

In particular:
11:05:15.998 [25487] <2> manage_scsi_reserve: SCSI RESERVE
11:05:16.002 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:05:16.002 [25487] <2> send_MDS_msg: OP_STATUS 0 3352 newdb2 8196 5 0 0 0 0 0 0 *NULL* 0
11:05:16.003 [25487] <2> JobInst::sendIrmMsg: returning
11:05:16.003 [25487] <2> send_operation_error: Decoded status = 4 from 5
11:05:16.049 [25487] <2> log_media_error: successfully wrote to error file - 10/08/10 11:05:16 H898L3 -1 OPEN_ERROR HP.ULTRIUM3-SCSI.000
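
For readers scanning a long bptm log like this one, a small filter can surface the lines worth reading first. The sketch below is an illustration, not part of NetBackup: it assumes the legacy log layout seen above (time, `[pid]`, `<severity>`, message) and assumes that a larger number in angle brackets means a more severe entry. The name `notable_lines` and the default threshold are made up for this example.

```python
import re

# Assumed layout of a legacy bptm log line:
#   "HH:MM:SS.mmm [pid] <severity> function: message"
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<pid>\d+)\] <(?P<sev>\d+)> (?P<msg>.*)$"
)


def notable_lines(log_text, min_severity=4):
    """Return (time, pid, severity, message) for lines at or above min_severity."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and int(match.group("sev")) >= min_severity:
            hits.append(
                (
                    match.group("time"),
                    int(match.group("pid")),
                    int(match.group("sev")),
                    match.group("msg"),
                )
            )
    return hits


# Three lines copied from the log in this thread.
sample = """\
11:05:15.998 [25487] <2> manage_scsi_reserve: SCSI RESERVE
11:05:16.002 [25487] <4> manage_scsi_reserve: failed opening device with O_NDELAY
11:12:11.249 [25487] <16> mount_open_media: error requesting media, TpErrno = Robot operation failed
"""

for hit in notable_lines(sample):
    print(hit)
```

With the threshold at 4, routine `<4>` lines such as `create_tpreq_file` also show up when the whole log is fed in; raising it to 16 leaves only the lines logged at the error level.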

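Opening /dev/rmt/0cbn fails. Someone suggested deleting the contents of /dev/rmt/ and then running devfsadm -c tapes to rediscover the tape library. After I followed those steps, however, /dev/rmt/ was still empty, 0cbn was missing, and tpconfig -d no longer showed the status of the library's drive.
In the end I had no choice but to link 0cbn manually to the specific target under devices. With that in place, tpconfig -d shows the drive status, but backups still fail.

Also, I had earlier tested backing up data directly to tape, and it failed with an error saying /dev/rmt/0cbn could not be opened.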
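
Since the `0cbn` link was recreated by hand, one quick sanity check is whether each link under `/dev/rmt` still points at an existing node. This is a minimal sketch, assuming a Unix host; `check_links` is a hypothetical helper, not a NetBackup or Solaris tool. A link whose target exists can still fail to open if no device answers behind it, so this only rules out dangling links.

```python
import os


def check_links(directory):
    """Map each symlink name in directory to (target, target_exists)."""
    report = {}
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if os.path.islink(full):
            target = os.readlink(full)
            # os.path.exists follows the link, so False means it dangles.
            report[name] = (target, os.path.exists(full))
    return report


# Only inspect /dev/rmt when it is present on this host.
if os.path.isdir("/dev/rmt"):
    for name, (target, ok) in check_links("/dev/rmt").items():
        print(name, "->", target, "" if ok else "(dangling)")
```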

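#13 | Posted 2010-10-09 09:16
Reply to #11:
I have already removed the original tape. NBU can pick another tape, but the error is the same.

At first I also suspected the drive head, and I ran a cleaning tape through it several times.
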
#14 | Posted 2010-10-09 10:04
#mt -f /dev/rmt/0 status
/dev/rmt/0: No such device or address
#
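
`No such device or address` is the system's message for errno `ENXIO`, which typically means the device node resolves but no device answers behind it; a path that does not exist at all gives `ENOENT` instead. The sketch below is a hypothetical helper (`probe_device`), assuming a Unix host. It tries a non-blocking open, similar in spirit to the `O_NDELAY` open that bptm logs, and reports the errno name.

```python
import errno
import os


def probe_device(path):
    """Try a non-blocking read-only open; return 'ok' or the errno name."""
    try:
        fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    except OSError as exc:
        # Translate the numeric errno into its symbolic name, e.g. ENXIO.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return "ok"


# Device path taken from the thread; adjust for the host being checked.
print(probe_device("/dev/rmt/0cbn"))
```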

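#15 | Posted 2010-10-09 11:33
Reply to #12 chiarier

Did you make a backup before deleting? If so, you can restore it.
1. Stop NBU: /usr/openv/netbackup/bin/bp.kill_all
2. Restart the tape library
3. Start NBU: /usr/openv/netbackup/bin/bp.start_all
4. Test whether a file system backup to tape works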

#16 | Posted 2010-10-09 11:57
Reply to #15 onion

I have tested this. Restarting NBU and restarting the tape library did not solve it.

#17 | Posted 2010-10-09 14:16
Do you have another tape library? Try connecting it.

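#18 | Posted 2010-10-09 16:22
With robtest, tapes can be moved into the drive and out of it again. This shows that the physical link is working; NBU may have locked the drive.
Could you post the output of the following commands?
mt -f /dev/rmt/0cbn status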
/usr/openv/
/usr/openv/volmgr/bin/tpconfig -d
/usr/openv/volmgr/bin/vmdareq -a

#19 | Posted 2010-10-12 18:03
Reply to #18 onion

#mt -f /dev/rmt/0cbn status
/dev/rmt/0cbn: No such device or address

#tpconfig -d
It used to show the drive as UP; now it is completely DOWN.

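#20 | Posted 2010-10-12 18:05
Sorry for not posting an update; I have been busy the last two days.
The day before yesterday I had the service vendor check the system, and they said the connecting cable is faulty. Once the replacement cable arrives and has been swapped in, I will report the result.
Thanks, everyone!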