Posted 2008-08-22 23:13

The post is in English; I hope it helps.

I encountered this problem today and thought it would be important to share the situation and fix with you.

I've got one node at HACMP software level 5.4.1.2 and the other at 5.4.1.3. Both levels have the problem, but 5.4.1.1 doesn't.

In a 2-node cluster, when one node is down (either clcomdES or AIX is shut down) and the resource groups are OFFLINE, trying to bring a resource group online through smit will not work.

You will see this error:

┌──────────────────────────────────────────────────────────────────────────┐
│ Select a Resource Group │
│ │
│ Move cursor to desired item and press Enter. │
│ │
│ migcheck[471]: cl_connect() error, nodename=coenim, rc=-1 │
│ hacmprg OFFLINE │
│ │

Regardless of which line you select (even the migcheck error line shows up as a selectable item), the next error will be:

1800-018 There are currently no additional
SMIT selector screen entries available for this
item. This item may require installation of
additional software before it can be accessed.

smit will exit and the resource group will still be in the OFFLINE state.

This is not good.

The smit.log contains:
(Specified sm_name_hdr with
id = "cl_resgrp_start.select_node_migcheck[471]",
was not found in SMIT/ODM database.)
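To check quickly whether a given smit.log shows this symptom, something like the sketch below could be used. It assumes the log is the default smit.log in the invoking user's home directory, and the function name `smit_log_has_migcheck` is made up for illustration. It only looks for a selector id that ends in the migcheck diagnostic, as in the log excerpt above.

```shell
#!/bin/sh
# smit_log_has_migcheck [LOGFILE]
# Hypothetical helper: succeeds (exit 0) if LOGFILE contains a selector id
# polluted by the migcheck diagnostic, e.g.
#   id = "cl_resgrp_start.select_node_migcheck[471]",
# LOGFILE defaults to $HOME/smit.log (assumed default location).
smit_log_has_migcheck() {
    log=${1:-$HOME/smit.log}
    [ -r "$log" ] || return 2
    # \[ matches a literal "[", [0-9]* the line number, ] the closing bracket
    grep 'id = ".*migcheck\[[0-9]*]"' "$log" >/dev/null
}
```

Usage: `smit_log_has_migcheck && echo "migcheck noise found in smit.log"`.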

The reason for this problem is below:

Injecting CMVC Defect/Feature: 642271
Analysis text: In 5.4.1.2, the fix for defect 642271/APAR IZ21214
is introduced to differentiate/add granularity to the various
errors which can occur in cl_migcheck ANY. Now, the error 1
indicates a migration is in progress, so do not run automatic
cluster verification/synchronization. An error 3 indicates
either socket, socket connection, or read communication error
in clcomd, so continue running automatic cluster
verification/synchronization. However this change will print
error message like "migcheck[471]: cl_connect() error,
nodename=akash22, rc=-1". This additional error message is
causing additional issues while loading smit menu.
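The effect described in the analysis can be reproduced in isolation. The sketch below does not call any HACMP code: `fake_migcheck` and `list_resource_groups` are hypothetical stand-ins, and it assumes the menu front end reads both stdout and stderr of the command that builds the list. It shows how one stray diagnostic line becomes an extra "entry", and how `2>/dev/null` removes it.

```shell
#!/bin/sh
# Hypothetical stand-in for cl_migcheck: the diagnostic goes to stderr.
fake_migcheck() {
    echo "migcheck[471]: cl_connect() error, nodename=coenim, rc=-1" >&2
}

# Hypothetical stand-in for the script that produces the selector list.
# Passing "quiet" suppresses the diagnostic, mirroring "2>/dev/null".
list_resource_groups() {
    if [ "$1" = "quiet" ]; then
        fake_migcheck 2>/dev/null
    else
        fake_migcheck
    fi
    echo "hacmprg OFFLINE"
}

# Capture both streams, as a menu front end might do (assumption).
noisy=$(list_resource_groups 2>&1)
clean=$(list_resource_groups quiet 2>&1)

printf '%s\n' "--- without redirect ---" "$noisy"
printf '%s\n' "--- with 2>/dev/null ---" "$clean"
```

Without the redirect the captured list has two lines, the first being the error message; with it, only the resource group line remains.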

At this point you've got 3 choices:
1. Wait until APAR IZ29033 comes out (no ETA for it).
2. Modify /usr/es/sbin/cluster/utilities/clRGmove

Change the line:
cl_migcheck ANY

To read:
cl_migcheck ANY 2>/dev/null

Then re-issue the C-SPOC command via smit and it will work (you will be able to bring the resource group online).

3. Run clRGmove manually at the command line to bring the RG ONLINE:

/usr/es/sbin/cluster/utilities/clRGmove -s 'false' -u -i -g 'RGname' -n 'nodeX'

(where RGname=your RG and nodeX=HACMP node name)
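If you go with choice 2, it is probably worth keeping a copy of the original script. The following is only a sketch of how the edit could be scripted; the function name `patch_migcheck` and the `.orig` backup suffix are my own choices, and it assumes `cl_migcheck ANY` sits on a line by itself (leading blanks allowed), as quoted above. Check the result by eye before relying on it.

```shell
#!/bin/sh
# patch_migcheck FILE
# Hypothetical helper: keep a one-time backup as FILE.orig, then append
# "2>/dev/null" to every line consisting solely of "cl_migcheck ANY".
# Lines that already carry the redirect do not match, so re-running is harmless.
patch_migcheck() {
    file=$1
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    tmp="$file.tmp.$$"
    # Create the backup only if none exists yet
    [ -f "$file.orig" ] || cp -p "$file" "$file.orig" || return 1
    if sed 's|^\([[:space:]]*cl_migcheck ANY\)[[:space:]]*$|\1 2>/dev/null|' \
            "$file" > "$tmp"; then
        # Write back through the existing file so its permissions are kept
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}
```

Run as root it would be called as `patch_migcheck /usr/es/sbin/cluster/utilities/clRGmove`; to undo, copy the `.orig` file back. An HACMP update may replace the script, so the change might need to be reapplied or dropped once the APAR fix is installed.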

[HACMP cluster] Repost: HACMP - When 1 node is down