
Oracle 10g R2 RAC installation: GSD process fails to start

#1 | Posted 2006-08-22 18:06
I am installing 10gR2 RAC on AIX 5.3.
When I run root.sh on the last node, it reports that GSD cannot be started. Has anyone run into this? What usually causes it?
# ./root.sh
WARNING: directory '/oracle/product/10.2.0.1' is not owned by root
WARNING: directory '/oracle/product' is not owned by root
WARNING: directory '/oracle' is not owned by root
Checking to see if Oracle CRS stack is already configured
Checking to see if any 9i GSD is up

Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/oracle/product/10.2.0.1' is not owned by root
WARNING: directory '/oracle/product' is not owned by root
WARNING: directory '/oracle' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: oracle_a oracle_a_priv oracle_a
node 2: oracle_b oracle_b_priv oracle_b
clscfg: Arguments check out successfully.

NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
oracle_a
oracle_b
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps

Creating VIP application resource on (2) nodes...
Creating GSD application resource on (2) nodes...
Creating ONS application resource on (2) nodes...
Starting VIP application resource on (2) nodes...
Starting GSD application resource on (2) nodes1:CRS-0215: Could not start resource 'ora.oracle_a.gsd'.
Check the log file "/oracle/product/10.2.0.1/crs/log/oracle_a/racg/ora.oracle_a.gsd.log" for more details
.1:CRS-0215: Could not start resource 'ora.oracle_b.gsd'.
Check the log file "/oracle/product/10.2.0.1/crs/log/oracle_b/racg/ora.oracle_b.gsd.log" for more details
..
Starting ONS application resource on (2) nodes...


Done.

#2 | Posted 2006-08-22 18:11
Take a look at the logs /oracle/product/10.2.0.1/crs/log/oracle_a/racg/ora.oracle_a.gsd.log and /oracle/product/10.2.0.1/crs/log/oracle_b/racg/ora.oracle_b.gsd.log.
What does crs_stat show?
Can you bring GSD up by starting it manually?

#3 | Posted 2006-08-22 22:50

Starting GSD manually does not work either.

# more ora.oracle_b.gsd.log


Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.
2006-08-21 11:06:22.569: [ RACG][1] [397466][1][ora.oracle_b.gsd]: Failed to start GSD on local node

2006-08-21 11:06:22.569: [ RACG][1] [397466][1][ora.oracle_b.gsd]: clsrcexecut: env ORACLE_CONFIG_HOME=/oracle/product/10.2.0.1/crs

2006-08-21 11:06:22.569: [ RACG][1] [397466][1][ora.oracle_b.gsd]: clsrcexecut: cmd = /oracle/product/10.2.0.1/crs/bin/racgeut -e _USR_ORA_DEBUG=0 540 /oracle/product/10.2.0.1/crs/bin/gsdctl start

2006-08-21 11:06:22.569: [ RACG][1] [397466][1][ora.oracle_b.gsd]: clsrcexecut: rc = 1, time = 97.134s

2006-08-21 11:06:24.200: [ RACG][1] [397466][1][ora.oracle_b.gsd]: GSD is not running on the local node

2006-08-21 11:06:24.200: [ RACG][1] [397466][1][ora.oracle_b.gsd]: clsrcexecut: env ORACLE_CONFIG_HOME=/oracle/product/10.2.0.1/crs

2006-08-21 11:06:24.200: [ RACG][1] [397466][1][ora.oracle_b.gsd]: clsrcexecut: cmd = /oracle/product/10.2.0.1/crs/bin/racgeut -e _USR_ORA_DEBUG=0 540 /oracle/product/10.2.0.1/crs/bin/gsdctl stat

2006-08-21 11:06:24.200: [ RACG][1] [397466][1][ora.oracle_b.gsd]: clsrcexecut: rc = 1, time = 1.630s

2006-08-21 11:06:24.200: [ RACG][1] [397466][1][ora.oracle_b.gsd]: end for resource = ora.oracle_b.gsd, action = start, status = 1, time = 98.875s

2006-08-21 11:06:26.066: [ RACG][1] [524452][1][ora.oracle_b.gsd]: GSD is not running on the local node
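
As a rough aid for reading a log like the one above, here is a small Python sketch that pulls out the `clsrcexecut` return codes and durations so the failing steps stand out. The regular expression and the function name `failed_commands` are my own assumptions, based only on the lines pasted in this post and not on any documented Oracle log format.

```python
import re

# Assumed pattern, derived only from the "clsrcexecut: rc = N, time = Ts" lines in this post.
RC_PATTERN = re.compile(r"clsrcexecut: rc = (\d+), time = ([\d.]+)s")


def failed_commands(log_text):
    """Return (return_code, seconds) for every clsrcexecut call that did not return 0."""
    results = []
    for rc, seconds in RC_PATTERN.findall(log_text):
        if int(rc) != 0:
            results.append((int(rc), float(seconds)))
    return results


SAMPLE_LOG = """\
2006-08-21 11:06:22.569: [ RACG][1] [397466][1][ora.oracle_b.gsd]: clsrcexecut: rc = 1, time = 97.134s
2006-08-21 11:06:24.200: [ RACG][1] [397466][1][ora.oracle_b.gsd]: clsrcexecut: rc = 1, time = 1.630s
"""

print(failed_commands(SAMPLE_LOG))
```

In this thread both calls (`gsdctl start` and `gsdctl stat`) return 1, and the start attempt takes about 97 seconds before giving up.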

# ./crs_stat
NAME=ora.oracle_a.gsd
TYPE=application
TARGET=ONLINE
STATE=OFFLINE

NAME=ora.oracle_a.ons
TYPE=application
TARGET=ONLINE
STATE=ONLINE on oracle_a

NAME=ora.oracle_a.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE on oracle_a

NAME=ora.oracle_b.gsd
TYPE=application
TARGET=ONLINE
STATE=OFFLINE

NAME=ora.oracle_b.ons
TYPE=application
TARGET=ONLINE
STATE=ONLINE on oracle_b

NAME=ora.oracle_b.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE on oracle_b

# ./crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....e_a.gsd application    ONLINE    OFFLINE
ora....e_a.ons application    ONLINE    ONLINE    oracle_a
ora....e_a.vip application    ONLINE    ONLINE    oracle_a
ora....e_b.gsd application    ONLINE    OFFLINE
ora....e_b.ons application    ONLINE    ONLINE    oracle_b
ora....e_b.vip application    ONLINE    ONLINE    oracle_b
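
For a longer resource list, the long-form `crs_stat` output (the NAME=/TYPE=/TARGET=/STATE= blocks) can be filtered mechanically. The following Python sketch is only an illustration: it assumes blocks are separated by blank lines exactly as pasted in this post, and the helper names `parse_crs_stat` and `offline_but_wanted` are made up for the example.

```python
def parse_crs_stat(text):
    """Split long-form crs_stat output into one dict per resource.

    Assumes KEY=VALUE lines grouped into blocks separated by blank lines,
    as in the output pasted in this thread.
    """
    resources = []
    current = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            if current:
                resources.append(current)
                current = {}
            continue
        key, sep, value = line.partition("=")
        if sep:
            current[key] = value
    if current:
        resources.append(current)
    return resources


def offline_but_wanted(resources):
    """Names of resources whose TARGET is ONLINE but whose STATE is not ONLINE."""
    return [
        r["NAME"]
        for r in resources
        if r.get("TARGET") == "ONLINE" and not r.get("STATE", "").startswith("ONLINE")
    ]


SAMPLE = """\
NAME=ora.oracle_a.gsd
TYPE=application
TARGET=ONLINE
STATE=OFFLINE

NAME=ora.oracle_a.ons
TYPE=application
TARGET=ONLINE
STATE=ONLINE on oracle_a
"""

print(offline_but_wanted(parse_crs_stat(SAMPLE)))
```

Run against the full output above, this would single out the two GSD resources, which matches what the `-t` table shows.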

#4 | Posted 2006-08-23 03:56
1. Check your pre-installation tasks. You should not see these messages:
WARNING: directory '/oracle/product/10.2.0.1' is not owned by root
WARNING: directory '/oracle/product' is not owned by root
WARNING: directory '/oracle' is not owned by root

2. If you have installed Oracle clusterware on this machine before, you need to
remove all of it. You should not see something like this:
Oracle Cluster Registry configuration upgraded successfully

3. Run the cluster verification utility to see whether your installation of the clusterware was successful:
runcluvfy.sh stage -post crsinst -n node_list -verbose

#5 | Posted 2006-08-23 09:33
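The GSD process did not start properly. What happens if you run gsdctl start directly?
If GSD cannot be started, you can create the database manually. This does not affect use of the database, but you will not be able to manage the cluster database with some of the CRS commands.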

#6 | Posted 2006-08-23 09:37
If you want to trace this problem down, please refer to MetaLink notes 268937.1 and 178683.1.

#7 | Posted 2006-08-23 09:39
Subject: Tracing GSD, SRVCTL, GSDCTL, and SRVCONFIG
  Doc ID: Note:178683.1    Type: TROUBLESHOOTING
  Last Revision Date: 03-MAY-2004    Status: PUBLISHED


PURPOSE
-------

The purpose of this document is to assist in debugging SRVCTL, GSD, GSDCTL,
and SRVCONFIG problems.


SCOPE & APPLICATION
-------------------

This document is for support analysts to troubleshoot SRVCTL, GSD, GSDCTL,
and SRVCONFIG issues.


TRACING GSD, SRVCTL, GSDCTL, and SRVCONFIG
------------------------------------------

To provide verbose output for SRVCTL, GSD, GSDCTL, or SRVCONFIG, tracing can
be enabled to provide additional screen output.  

--------------------------------------------------------------------------

10g:

Just set the environment variable SRVM_TRACE to true to trace all of the
SRVM files like gsd, srvctl, and ocrconfig.
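
A minimal Python sketch of the 10g approach described above: run a command with SRVM_TRACE=true added to its environment and capture what it prints. The wrapper name `run_with_srvm_trace` is made up for illustration, and the command used here is a harmless stand-in so the snippet runs on any machine; on a cluster node the argument list would be something like `["gsdctl", "stat"]`.

```python
import os
import subprocess
import sys


def run_with_srvm_trace(argv):
    """Run argv with SRVM_TRACE=true set in a copy of the current environment."""
    env = dict(os.environ)
    env["SRVM_TRACE"] = "true"
    return subprocess.run(argv, env=env, capture_output=True, text=True)


# Stand-in command: a child Python process that just echoes the variable back.
result = run_with_srvm_trace(
    [sys.executable, "-c", "import os; print(os.environ['SRVM_TRACE'])"]
)
print(result.stdout.strip())
```

Setting the variable only for the child process keeps tracing from leaking into other commands run from the same shell.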

--------------------------------------------------------------------------

9i:

To Trace GSD:
-------------
1. vi the gsd.sh file in the $ORACLE_HOME/bin directory.

   For Windows:  Right click on the OraHome\bin\gsd.bat file and choose Edit.

2. At the end of the file, look for the following line:

  exec $JRE -classpath $CLASSPATH oracle.ops.mgmt.daemon.OPSMDaemon $MY_OHOME

3. Add the following just before the -classpath in the 'exec $JRE' line:

  -DTRACING.ENABLED=true -DTRACING.LEVEL=2

4. At the end of the gsd.sh file, the string should now look like this:

  exec $JRE -DTRACING.ENABLED=true -DTRACING.LEVEL=2 -classpath.....

5. Test this by running gsd.sh:

[opcbsol1]/u01/home/usupport> gsd.sh
[main][9:31:8:860] Daemon: argument is /u01/32bit/app/oracle/product/9.0.1
[main][9:31:8:893] tracing is true; at level 2
[main][9:31:8:893] trace file is /u01/32bit/app/oracle/product/9.0.1/srvm/log/gsdaemon.log
cont...


To Trace SRVCTL:
---------------
1. vi the srvctl file in the $ORACLE_HOME/bin directory.

   For Windows:  Right click on the OraHome\bin\srvctl.bat file and choose Edit.

2. At the end of the file, look for the following line:

  $JRE -classpath $CLASSPATH oracle.ops.opsctl.OPSCTLDriver "$@"

3. Add the following just before the -classpath in the '$JRE' line:

  -DTRACING.ENABLED=true -DTRACING.LEVEL=2

4. At the end of the srvctl file, the string should now look like this:

  $JRE -DTRACING.ENABLED=true -DTRACING.LEVEL=2 -classpath.....

5. Test this by running srvctl:

[opcbsol1]/u01/home/usupport> srvctl status -p V90321
[main][9:33:2:968] srvctl: tracing is true at level 2
[main][9:33:3:38] Going into GetActiveNodes constructor...
[main][9:33:3:59] Detected Cluster
[main][9:33:3:60] Cluster existence = true
[main][9:33:3:95] loaded library
[main][9:33:3:108] Inside GetActiveNodes.initializeCluster
[main][9:33:3:264] The status string is: 1
[main][9:33:3:265] The result string is: Everything ok So Far 1
cont...


To Trace GSDCTL:
---------------
1. vi the gsdctl file in the $ORACLE_HOME/bin directory.

   For Windows:  Right click on the OraHome\bin\gsdctl.bat file and choose Edit.

2. At the end of the file, look for the following line:

  $JRE -classpath $CLASSPATH oracle.ops.mgmt.daemon.GSDCTLDriver...

3. Add the following just before the -classpath in the '$JRE' line:

  -DTRACING.ENABLED=true -DTRACING.LEVEL=2

4. At the end of the gsdctl file, the string should now look like this:

  $JRE -DTRACING.ENABLED=true -DTRACING.LEVEL=2 -classpath.....

5. Test this by running gsdctl:

  [opcbsol1]/u02/32bit/app/oracle/product/9.2.0/bin> gsdctl stat
  [main] [15:41:34:849] [GetActiveNodes.create:Compile]  Going into GetActiveNodes
  [main] [15:41:34:918] [sQueryCluster.<init>:Compile]  Detected Cluster
  [main] [15:41:34:922] [sQueryCluster.isCluster:Compile]  Cluster existence = true
  cont...


To Trace SRVCONFIG:
-------------------
1. vi the srvconfig file in the $ORACLE_HOME/bin directory.

   For Windows:  Right click on the OraHome\bin\srvconfig.bat file and choose Edit.

2. At the end of the file, look for the following line:

  $JRE -classpath $CLASSPATH oracle.ops.mgmt.rawdevice.RawDeviceUtil $*

3. Add the following just before the -classpath in the '$JRE' line:

  -DTRACING.ENABLED=true -DTRACING.LEVEL=2

4. At the end of the srvconfig file, the string should now look like this:

  $JRE -DTRACING.ENABLED=true -DTRACING.LEVEL=2 -classpath.....

5. Test this by running srvconfig:

  [opcbsol1]/u02/32bit/app/oracle/product/9.2.0/bin> srvconfig -version
  [main] [16:0:58:395] [RawDeviceUtil.getDeviceName:Compile]  
  [main] [16:0:58:454] [sQueryCluster.<init>:Compile]  Detected Cluster
  [main] [16:0:58:457] [sQueryCluster.isCluster:Compile]  Cluster existence = true
  cont...


RELATED DOCUMENTS
-----------------

Note 178435.1 PRK% Errors - Cause and Action Required  
Note 169454.1 What to do with PRKR-1007, PRKR-1001, PRKO-2008, or PRKR-1020 errors

#8 | Posted 2006-08-23 09:40
Subject: Repairing or Restoring an Inconsistent OCR in RAC
  Doc ID: Note:268937.1    Type: BULLETIN
  Last Revision Date: 07-FEB-2006    Status: PUBLISHED


PURPOSE
-------

To provide a method of repairing or restoring an inconsistent OCR.


SCOPE & APPLICATION
-------------------

This article is intended for DBAs and Support Engineers who need to correct
an inconsistent OCR (Oracle Cluster Registry).


REPAIRING OR RESTORING AN INCONSISTENT OCR (ORACLE CLUSTER REGISTRY)
--------------------------------------------------------------------

If you have encountered a condition where you are unable to add or remove a CRS
resource, your OCR file may be inconsistent.  Example error when removing a CRS
resource with SRVCTL:

  PRKS-1028 : Configuration for <CRS resource> does not exist in cluster registry.

  or

  CRS-0210: Could not find resource  

Example error when adding a resource with SRVCTL:

  PRKS-1003 : Failed to register CRS resource...


POSSIBLE CAUSES OF OCR INCONSISTENCY
------------------------------------

1. The "-f" option of srvctl.  Do not use the -f (force) option in srvctl: the
-f flag causes it to ignore any errors and remove whatever pieces it can
anyway.  Without "-f" on the remove command, srvctl stops when it encounters
an error and does not remove the OCR entries.  You can run srvctl remove
without the -f as many times as you need without causing an inconsistency.
The only time a "-f" option should be used is if you have verified that you
have an inconsistency in the OCR and you are trying to correct it.

2. CRS_Unregister was used when it was not needed.  You should use srvctl remove
to remove CRS resources.  CRS_Unregister should only be used if you are trying
to repair an inconsistency.

3. The rootdeletenode.sbs script has a "-f" srvctl command in it.  This is a
known issue.  Best practice is to modify rootdeletenode.sbs to not do a "-f".

4. Another potential cause of inconsistency is running multiple commands to
configure the same object at the same time, such as multiple "srvctl add service"
commands that add different services to the same database at once.


FILES TO GATHER FOR OCR INCONSISTENCY
-------------------------------------

If you have a reproducible testcase that creates inconsistency in the OCR,
please open a service request and provide the exact steps to Oracle Support
Services.  The following are some files that may need to be reviewed for issues
involving OCR inconsistencies:

- CRS Log files:

  cd $ORA_CRS_HOME
  tar cf /var/backup/crs.tar crs/init crs/log css/init css/log evm/init evm/log srvm/log

- OCR dump file - To get this cd to $ORA_CRS_HOME/bin as the root user and issue
"ocrdump".  This will generate two files (ocrdump.log and OCRDUMPFILE).  

- $ORA_CRS_HOME/bin/crs_stat -u and $ORA_CRS_HOME/bin/crs_stat -p output

- srvctl config output:

  srvctl config database -d <db_name> -a
  srvctl config service -d <db_name> -a
  srvctl config nodeapps -n <node name>
  srvctl config nodeapps -n <node name> -a -g -s -l
  srvctl config asm -n <node name>

- Trace output of SRVCTL commands.  Set the environment variable SRVM_TRACE to
true prior to running srvctl commands.  


FIXING INCONSISTENT OCR FILES
-----------------------------

The following methods can be used to correct an inconsistent OCR file:

Method 1: Repair the OCR
Method 2: Restore the OCR from a backup.
Method 3: Re-install CRS


METHOD 1 - REPAIR THE OCR
-------------------------

If you cannot use srvctl to remove a CRS resource, find out if there is
information missing in CRS or in SRVM.  To do this, run
$ORA_CRS_HOME/bin/crs_stat to see if the CRS resource exists.  Also use
one of the following srvctl commands to see if the resource exists in
SRVM:

  srvctl config database -d <db_name> -a
  srvctl config service -d <db_name> -a
  srvctl config nodeapps -n <node name>
  srvctl config nodeapps -n <node name> -a -g -s -l
  srvctl config asm -n <node name>

You may either see information missing from the srvctl config command
or from the crs_stat command.  If you are unable to find an
inconsistency and you cannot remove the resource with srvctl, open a
service request or proceed to method 2.

SRVCTL CONFIG DATA IS MISSING BUT CRS_STAT INFORMATION IS PRESENT
-----------------------------------------------------------------

If information is missing from srvctl config, you can attempt to use
crs_unregister.  You should only use crs_unregister commands with the
understanding that your OCR may need to be restored anyway.  If
crs_unregister does not work you may need to either restore or re-create
your OCR.  To use crs_unregister:

1. Get the resource name of the resource you are trying to remove with
   $ORA_CRS_HOME/bin/crs_stat.  Example:

   cd $ORA_CRS_HOME/bin
   ./crs_stat

   You will see CRS resources, example of a CRS resource:

   NAME=ora.V10SN.V10SN2.inst
   TYPE=application
   TARGET=ONLINE
   STATE=ONLINE on opcbsol2

2. Attempt to unregister the CRS resource with crs_unregister.  Example:

   cd $ORA_CRS_HOME/bin
   ./crs_unregister ora.V10SN.V10SN2.inst

3. If your original goal was to add a CRS resource, try to use srvctl to add
   the resource.  If your original goal was to remove a CRS resource, verify
   that it is removed from crs_stat and srvctl config.  

If there is still a problem with the OCR, proceed to method 2.

CRS_STAT DATA IS MISSING BUT SRVCTL CONFIG INFORMATION IS PRESENT
-----------------------------------------------------------------

If information is missing from crs_stat, you can attempt to use
srvctl remove -f.  You should only use srvctl remove -f commands with the
understanding that your OCR may need to be restored anyway.  If
srvctl remove -f does not work you may need to either restore or re-create
your OCR.  To use srvctl remove -f:

1. Determine what resource exists in srvctl config but is missing from crs_stat

2. Remove the resource with the -f option in srvctl.

   srvctl remove database -d <database-name> -f
   srvctl remove instance  -d <database-name> [-i <instance-name>] -f
   srvctl remove service -d <database-name> -s <service-name> [-i <instance-name>] -f
   srvctl remove nodeapps -n <node-name>  -f

3. If your original goal was to add a CRS resource, try to use srvctl to add
   the resource.  If your original goal was to remove a CRS resource, verify
   that it is removed from crs_stat and srvctl config.  


METHOD 2 - RESTORE THE OCR FROM A BACKUP
----------------------------------------

If crs_unregister or srvctl remove -f does not fix the OCR problem, you may
need to restore the OCR from a backup.  Oracle automatically takes backups of
the OCR every 4 hours.  Oracle retains the last 3 of these 4-hourly backups,
plus a backup that is one day old and a backup that is one week old.  Here are
the steps for restoring the OCR.

1. Find out at what time the inconsistency in the OCR occurred.

2. Find an OCR backup from a time prior to when the inconsistency occurred.  
   To do this cd to $ORA_CRS_HOME/cdata/<cluster name> or run
   $ORA_CRS_HOME/bin/ocrconfig -showbackup.  Example:

   # pwd
   /t02/app/oracle/product/crs/cdata/crs_opcbsol
   # ls -ltr
   total 46560
   -rw-r-----   1 root     root     3960832 Apr 12 19:53 week.ocr
   -rw-r-----   1 root     root     3960832 Apr 13 03:53 day.ocr
   -rw-r-----   1 root     root     3960832 Apr 14 03:54 backup02.ocr
   -rw-r-----   1 root     root     3960832 Apr 14 03:54 day_.ocr
   -rw-r-----   1 root     root     3960832 Apr 14 07:54 backup01.ocr
   -rw-r-----   1 root     root     3960832 Apr 14 11:54 backup00.ocr

3. If you have a backup of the OCR from prior to the time of the inconsistency,
   reboot the nodes in single user mode or runlevel 1.  If you are unable to
   reboot into single user mode for some reason, you can disable CRS with:

   Sun or Linux:

        /etc/init.d/init.crs disable
        /etc/init.d/init.crs stop

   HP-UX or HP Tru64:

        /sbin/init.d/init.crs disable
        /sbin/init.d/init.crs stop

   IBM AIX:

        /etc/init.crs disable
        /etc/init.crs stop

4. After all nodes are rebooted in single user mode and/or you have verified that
   CRS is not running (ps -ef | grep crs), restore the OCR with ocrconfig.  
   Example:

   cd $ORA_CRS_HOME/bin
   ./ocrconfig -restore /t02/app/oracle/product/crs/cdata/crs_opcbsol/week.ocr

5. Re-enable CRS if it was disabled.  Example:

   Sun or Linux:

        /etc/init.d/init.crs enable

   HP-UX or HP Tru64:

        /sbin/init.d/init.crs enable

   IBM AIX:

        /etc/init.crs enable

6. Reboot the nodes.
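
Step 2 of the procedure above (choosing a backup taken before the inconsistency) can be sketched in Python. This is only an illustration under stated assumptions: the file modification time is treated as the backup time, the year 2005 is invented because the `ls -ltr` listing shows no year, and `pick_backup` is a hypothetical helper, not an Oracle tool.

```python
from datetime import datetime


def pick_backup(backups, inconsistency_time):
    """Return the name of the newest backup taken strictly before inconsistency_time.

    backups is a list of (file_name, timestamp) pairs; returns None if none qualifies.
    """
    earlier = [(ts, name) for name, ts in backups if ts < inconsistency_time]
    return max(earlier)[1] if earlier else None


# File names and times copied from the listing in step 2; the year is an assumption.
BACKUPS = [
    ("week.ocr", datetime(2005, 4, 12, 19, 53)),
    ("day.ocr", datetime(2005, 4, 13, 3, 53)),
    ("backup02.ocr", datetime(2005, 4, 14, 3, 54)),
    ("day_.ocr", datetime(2005, 4, 14, 3, 54)),
    ("backup01.ocr", datetime(2005, 4, 14, 7, 54)),
    ("backup00.ocr", datetime(2005, 4, 14, 11, 54)),
]

# Suppose the inconsistency appeared at 09:00 on Apr 14.
print(pick_backup(BACKUPS, datetime(2005, 4, 14, 9, 0)))
```

With that hypothetical cutoff the sketch selects backup01.ocr, the 07:54 copy. The chosen file would then be the argument to `ocrconfig -restore` in step 4.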


METHOD 3 - RE-INSTALL CRS
-------------------------

If all else fails, re-install CRS.  Only do this after consulting with Oracle
Support Services, and only if there is no reasonable way to fix the inconsistency.

1. Use Note 239998.1 to completely remove the CRS installation.  

2. Re-install CRS

3. Run the CRS root.sh as prompted at the end of the CRS install.

4. Run the root.sh in the database $ORACLE_HOME to re-run VIPCA.  This will
re-create the VIP, GSD, and ONS resources.

5. Use NETCA to re-add any listeners.

6. Add databases and instances with SRVCTL, syntax is in Note 259301.1


RELATED DOCUMENTS
-----------------

Note 259301.1 - CRS and 10g Real Application Clusters
Note 178683.1 - Tracing GSD, SRVCTL, GSDCTL, and SRVCONFIG
Note 239998.1 - 10g RAC How to Clean Up After a Failed CRS Install

#9 | Posted 2006-08-23 10:37
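Running gsdctl start directly gives: fail to start the GSD.
While installing the database software I also got an error saying that a component could not be installed. Could there be some other problem with the system? I have now reinstalled.
Could the system patch level be too high? I am currently on AIX5305.

Many thanks, blue_stone (真水无香)!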

#10 | Posted 2006-08-24 11:38

How can this be solved? Has anyone dealt with it before?