[Veritas NBU] NBU 7.1 on SUSE 11: deduplication engine fails to start on the media server after an unexpected power loss

#1 Posted 2011-08-24 11:01
Last edited by savage_jam on 2011-08-24 11:04

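Environment
NBU 7.1 on SUSE 11, with one master server and one media server, both rack-mounted. The AdvancedDisk storage pool is on the master; the deduplication storage pool is on the media server.

Symptom
The media server lost power unexpectedly during normal operation. After it came back up and the services were started, the deduplication engine would not start, and backup jobs fail with status 213 (no storage units available for use). See below: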

media525:/etc/init.d # service netbackup stop
stopping the NetBackup Service Monitor
stopping the NetBackup Service Layer
stopping the NetBackup Remote Monitoring Management System
stopping the NetBackup compatibility daemon
stopping the Media Manager volume daemon
stopping the NetBackup Deduplication Manager
stopping the NetBackup client daemon
stopping the NetBackup network daemon
media525:/etc/init.d # service netbackup start
NetBackup Authentication daemon started.
NetBackup network daemon started.
NetBackup client daemon started.
NetBackup SAN Client Fibre Transport daemon started.
NetBackup Database Server started.
NetBackup Authorization daemon started.
NetBackup Event Manager started.
NetBackup Audit Manager started.
NetBackup Deduplication Manager started.
NetBackup Deduplication Engine not started.
NetBackup Enterprise Media Manager started.
NetBackup Resource Broker started.
Rebuilding device nodes.
Media Manager daemons started.
NetBackup request daemon started.
NetBackup compatibility daemon started.
NetBackup Job Manager started.
NetBackup Policy Execution Manager started.
NetBackup Storage Lifecycle Manager started.
NetBackup Remote Monitoring Management System started.
NetBackup Key Management daemon started.
NetBackup Service Layer started.
NetBackup Agent Request Server started.
NetBackup Bare Metal Restore daemon not started.
NetBackup Vault daemon started.
NetBackup Service Monitor started.
NetBackup Bare Metal Restore Boot Server daemon started.

After the problem appeared, I mistakenly deleted all of the .bin and .bhd files under data. The following command now fails:
meida525:/usr/openv/pdde/pdcr/bin # ./spoold --trace
Error [139888106120960]: 25002: _storeCheckContainers: container index file /pdvol/data/64.bhd is missing
Error [139888106120960]: 25002: Could not initialize DataStore Manager
Error [139888106120960]: 26016: Store Manager: Activate failure.


meida525:/pdvol/data # ls
.cnf  .cnt  .identity  dcdelinfo.log  journal  

meida525:/usr/openv/netbackup/logs/bpdbm # cat log.082411
10:19:08.299 [18166] <16> db_error_add_to_file: wait_lock(/usr/openv/netbackup/db/error/errordb.lock) failed: -1, errno: 2 (No such file or directory)
10:19:08.299 [18166] <4> bpdbm: INITIATING bpdbm: NetBackup 7.1 2011020316 on meida525 IDIRSTRUCT=2 (VERBOSE = 0)
10:19:08.301 [18166] <2> bpdbm: meida525 is not the primary server master522...exiting


meida525:/usr/openv/netbackup/logs/bplist # cat log.082411
00:00:29.055 [2789] <2> logparams: -clean_old_logs
00:00:29.072 [2789] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6635: 0: fopen() failed: 2 0x00000002
00:00:29.072 [2789] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6636: 0: fopen() failed: /usr/openv/var/host_cache/16d/34c5176d+bprd,1,20,2,1,0+master522.txt
00:00:29.075 [2789] <2> vnet_pbxConnect: pbxConnectEx Succeeded
00:00:29.075 [2789] <2> logconnections: BPRD CONNECT FROM 129.31.66.25.46612 TO 129.31.66.22.1556 fd = 4
00:00:29.077 [2789] <2> delete_log_files:
00:00:29.077 [2789] <2> delete_log_files:
00:00:29.077 [2789] <2> delete_log_files:
00:00:29.087 [2789] <2> delete_log_files: deleted 0 logs > 28 days old
09:51:00.490 [16599] <2> logparams: -resync_host_cache 120
09:51:00.492 [16599] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6635: 0: fopen() failed: 2 0x00000002
09:51:00.492 [16599] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6636: 0: fopen() failed: /usr/openv/var/host_cache/06b/c413626b+0,1,402,0,1,0+localhost.txt
10:06:17.100 [16940] <2> logparams: -resync_host_cache 120
10:18:02.390 [17653] <2> logparams: -resync_host_cache 120
10:18:02.517 [17677] <2> logparams: -is_local_host master522
10:18:02.517 [17677] <2> bpclntcmd_main: master522 is not a local host: 48
10:18:02.528 [17683] <2> logparams: -is_local_host master522
10:18:02.528 [17683] <2> bpclntcmd_main: master522 is not a local host: 48
10:34:00.726 [19523] <2> logparams: -resync_host_cache 120
10:50:00.644 [19948] <2> logparams: -resync_host_cache 120

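Is it possible to restore the deduplication storage unit to a working state by reinitializing the database index, or by some other means? Many thanks!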

#2 Posted 2011-08-24 11:57
Do you want to recover the deleted storage unit, or just clear its state?

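#3 Posted 2011-08-24 14:01
Reply to #2 无牙

I'm not trying to recover the data on the media server; I only need the storage unit working again. Moderator, please point me in the right direction.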

#4 Posted 2011-08-24 22:39
NetBackup Deduplication Engine not started.

Check spoold.log.

You may find an entry like "can't move  ******** to #########".
If so, mv ******** to another location and restart the NBU services. Deduplication should then work normally.

#5 Posted 2011-08-25 15:58
Reply to #4 赵大少爷

The log does not contain the message you mentioned. It only has:

# cat /pdvol/log/spoold/spoold.log

August 23 16:14:44 INFO [140501870786304]: DataStore Manager: initializing
August 23 16:14:44 INFO [140501870786304]: Task thread stack size: 0 bytes
August 23 16:14:44 ERR [140501870786304]: 25002: _storeCheckContainers: container index file /pdvol/data/1001.bhd is missing
August 23 16:14:44 ERR [140501870786304]: 25002: Could not initialize DataStore Manager
August 23 16:14:44 INFO [140501870786304]: Server is Version 6.0600.0011.011, Protocol Version 6.6.1
August 23 16:14:44 ERR [140501870786304]: 26016: Store Manager: Activate failure.
August 23 16:15:44 INFO [139897626584832]: set thread[ [139897626584832]: ] max log size to 10000000
August 23 16:15:44 INFO [139897626584832]: set entire process max log size to 10000000
August 23 16:15:44 INFO [139897626584832]: Successfully loaded configuration from /pdvol/etc/puredisk/contentrouter.cfg
August 23 16:15:44 INFO [139897626584832]: Startup: Symantec PureDisk Content Router Version 6.0600.0011.0117.
August 23 16:15:44 INFO [139897626584832]: Startup: using Symantec: libdct 6.0.0.0, July 7, 2004
August 23 16:15:44 INFO [139897626584832]: Startup: using Symantec PureDisk: libcr 6.1.0.0, December 13, 2006
August 23 16:15:44 INFO [139897626584832]: Memory Manager: initializing
August 23 16:15:44 INFO [139897626584832]: Memory Manager: initialization complete
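
Each start attempt in the log above names one missing container index file. The following is a minimal sketch, assuming the message format shown in that log, for collecting every path spoold has reported as missing. The function name `missing_containers` is made up for illustration and is not part of NetBackup.

```shell
#!/bin/sh
# Print each container file that spoold reported as missing, one path per line.
# Assumes the "_storeCheckContainers: container index file <path> is missing"
# wording seen in the log excerpt; other spoold versions may word it differently.
missing_containers() {
    # $1: path to a spoold log file
    sed -n 's/.*container index file \(.*\) is missing.*/\1/p' "$1" | sort -u
}
```

Usage would be along the lines of `missing_containers /pdvol/log/spoold/spoold.log`. It only reads the log; it does not touch anything under /pdvol/data.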

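Short of creating fake .bin and .bhd files to trick it, is there any other way to solve this? Reinitialize the internal database? Deleting the storage pool directly also fails with: [Failed to delete disk pool diskpool525: invalid command parameter(20) DSM has found that an association still exists between a storage unit and the disk pool: test-media ==>diskpool525(PureDisk)@media522  (MM Status 20)]. I don't know how to fix this. My understanding here is admittedly shallow, so I'd appreciate some guidance.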

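#6 Posted 2011-08-25 17:33
To delete the disk pool (dp), I think you first have to set the disk volume (dv) and the dp to DOWN.
Then delete the dv, and after that the dp.

The trick approach should work.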
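
The order suggested here can be sketched as a dry run that only prints commands. This is a hedged illustration, not a verified procedure: the pool and storage unit names come from the error message quoted earlier in the thread, `PureDiskVolume` is an assumed volume name, and the `nbdevconfig` / `bpstudel` option spellings are written from memory and should be checked against the NetBackup 7.1 commands reference. The storage unit is removed first because that error message says it is still associated with the disk pool.

```shell
#!/bin/sh
# Dry-run sketch: prints the teardown commands in order and executes nothing.
DP=${DP:-diskpool525}      # disk pool name from the quoted error message
STU=${STU:-test-media}     # storage unit name from the quoted error message
DV=${DV:-PureDiskVolume}   # ASSUMPTION: volume name is not given in this thread

teardown_plan() {
    # 1. Remove the storage unit that still references the disk pool.
    echo "bpstudel -label $STU"
    # 2. Mark the disk volume and the disk pool DOWN.
    echo "nbdevconfig -changestate -stype PureDisk -dp $DP -dv $DV -state DOWN"
    echo "nbdevconfig -changestate -stype PureDisk -dp $DP -state DOWN"
    # 3. Delete the disk volume, then the disk pool.
    echo "nbdevconfig -deletedv -stype PureDisk -dp $DP -dv $DV"
    echo "nbdevconfig -deletedp -stype PureDisk -dp $DP"
}

teardown_plan
```

Printing rather than running keeps the sketch safe to try on any machine; each line can be reviewed and run by hand on the master server.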

#7 Posted 2011-08-25 22:14
First stop all NBU services.

Then reset the configuration file:
cp -fP /usr/openv/pdde/pdconfigure/cfg/userconfigs/pdregistry.cfg /etc/pdregistry.cfg

Remove the directory that was created when MSDP was set up:
rm -rf /pdvol/

On the master server, delete the STU, the storage pool, and the storage server.

Then rebuild.
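
Because these steps end in an `rm -rf`, a dry run with a guard on the target path may be worth doing first. The sketch below only prints the media-server commands from this post; `reset_plan` is a hypothetical helper name, and the stop command is the `service netbackup stop` call shown in the first post.

```shell
#!/bin/sh
# Dry-run sketch of the media-server reset: prints commands, executes nothing.
reset_plan() {
    pdvol=$1
    # Refuse obviously dangerous targets before printing an rm -rf line.
    case "$pdvol" in
        ""|"/")
            echo "refusing to plan removal of '$pdvol'" >&2
            return 1
            ;;
    esac
    echo "service netbackup stop"
    echo "cp -fP /usr/openv/pdde/pdconfigure/cfg/userconfigs/pdregistry.cfg /etc/pdregistry.cfg"
    echo "rm -rf $pdvol"
}

reset_plan /pdvol
```

The master-side deletions (STU, storage pool, storage server) and the rebuild are left out, since the post does not give commands for them.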