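My project is stuck on RHCS... Asking jerrywjl for some guidance. The cluster nodes come online, but clustat shows no resources, and the resources cannot be started.
The details are as follows: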
[root@udbapp1 ~]# uname -a
Linux udbapp1 2.6.18-53.el5xen #1 SMP Wed Oct 10 16:48:44 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
[root@udbapp1 ~]# more /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
119.87.244.70 udbapp1.local udbapp1
119.87.244.69 udbapp2.local udbapp2
#119.87.244.70 udbapp1
#119.87.244.69 udbapp2
[root@udbapp1 ~]#
-------------------------------------------
[root@udbapp2 ~]# more /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
119.87.244.70 udbapp1.local udbapp1
119.87.244.69 udbapp2.local udbapp2
#119.87.244.70 udbapp1
#119.87.244.69 udbapp2
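The above are the /etc/hosts files. Fencing uses HP iLO. The hardware is connected as follows:
On each HP host, two network ports are bonded into bond0, with IPs 119.87.244.70 and 119.87.244.69 respectively; the iLO addresses are 119.87.244.71 and 119.87.244.72. The network ports and the iLO ports are all connected to the switch and can all ping each other.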
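A quick sanity check on the hosts files above can be sketched in Python. As far as I understand, cman communicates over whichever address the clusternode name resolves to, so each node name should map to a real (non-loopback) address on both machines. The helper names `parse_hosts` and `check_nodes` are made up for this illustration, and the sample text is copied from the hosts file shown above.

```python
import ipaddress

def parse_hosts(text):
    """Map each hostname/alias in /etc/hosts-style text to its first listed address."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        addr, *names = line.split()
        for name in names:
            table.setdefault(name, addr)  # first match wins, like the resolver
    return table

def check_nodes(hosts_text, node_names):
    """Return {node: address}, with None for names that are missing or loopback."""
    table = parse_hosts(hosts_text)
    result = {}
    for node in node_names:
        addr = table.get(node)
        if addr is not None and ipaddress.ip_address(addr).is_loopback:
            addr = None
        result[node] = addr
    return result

# Sample copied from the /etc/hosts shown in this post.
HOSTS = """\
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
119.87.244.70 udbapp1.local udbapp1
119.87.244.69 udbapp2.local udbapp2
#119.87.244.70 udbapp1
#119.87.244.69 udbapp2
"""

print(check_nodes(HOSTS, ["udbapp1", "udbapp2"]))
```

For the hosts file in this post, both node names map to their bond0 addresses, so this particular check does not point to a problem.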
[root@udbapp2 ~]# fence_ilo -a 119.87.244.71 -l redhat -p redhat123456 -o status
power is ON
success
-------------------
[root@udbapp1 ~]# fence_ilo -a 119.87.244.72 -l redhat -p redhat123456 -o status
power is ON
success
The /etc/cluster/cluster.conf file is as follows:
<?xml version="1.0" ?>
<cluster config_version="1" name="cluster_2">
<fence_daemon post_fail_delay="0" post_join_delay="3"/>
<clusternodes>
<clusternode name="udbapp1" nodeid="1" votes="1">
<fence>
<method name="1">
<device name="fence_1"/>
</method>
</fence>
</clusternode>
<clusternode name="udbapp2" nodeid="2" votes="1">
<fence>
<method name="1">
<device name="fence_2"/>
</method>
</fence>
</clusternode>
</clusternodes>
<cman expected_votes="1" two_node="1">
<multicast addr="224.0.0.1"/>
</cman>
<fencedevices>
<fencedevice agent="fence_ilo" hostname="119.87.244.71" login="redhat" name="fence_1" passwd="redhat123456"/>
<fencedevice agent="fence_ilo" hostname="119.87.244.72" login="redhat" name="fence_2" passwd="redhat123456"/>
</fencedevices>
<rm>
<failoverdomains>
<failoverdomain name="udbapp" ordered="1" restricted="0"/>
<failoverdomainnode name="udbapp1" priority="1"/>
</failoverdomains>
<resources>
<ip address="119.87.244.73" monitor_link="1"/>
</resources>
<service autostart="1" domain="udbapp" name="apache" recovery="relocate">
<ip ref="119.87.244.73"/>
<script file="/etc/init.d/httpd" name="httpd"/>
</service>
</rm>
</cluster>
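One thing that might be worth checking in a configuration like the one above is the nesting inside the rm section. In the cluster.conf layouts I am familiar with, failoverdomainnode elements are children of a failoverdomain element, whereas here failoverdomain is self-closed and the node entry sits beside it. The Python sketch below only reports that kind of structural oddity; the function name `check_rm_section` is hypothetical, the sample is a trimmed copy of the rm section above, and this is a possible lead rather than a confirmed diagnosis.

```python
import xml.etree.ElementTree as ET

def check_rm_section(conf_xml):
    """Return a list of human-readable findings about the <rm> section."""
    root = ET.fromstring(conf_xml)
    findings = []
    domains = root.find("rm/failoverdomains")
    if domains is not None:
        for child in domains:
            if child.tag == "failoverdomainnode":
                # A node entry directly under <failoverdomains> belongs to no domain.
                findings.append(
                    "failoverdomainnode %r is not nested inside a failoverdomain"
                    % child.get("name"))
            elif child.tag == "failoverdomain" and child.find("failoverdomainnode") is None:
                findings.append(
                    "failoverdomain %r has no member nodes" % child.get("name"))
    # Every service should refer to a failover domain that is actually defined.
    defined = {d.get("name") for d in root.findall("rm/failoverdomains/failoverdomain")}
    for svc in root.findall("rm/service"):
        domain = svc.get("domain")
        if domain and domain not in defined:
            findings.append(
                "service %r refers to undefined domain %r" % (svc.get("name"), domain))
    return findings

# Trimmed copy of the <rm> section from the cluster.conf in this post.
SAMPLE = """\
<cluster config_version="1" name="cluster_2">
  <rm>
    <failoverdomains>
      <failoverdomain name="udbapp" ordered="1" restricted="0"/>
      <failoverdomainnode name="udbapp1" priority="1"/>
    </failoverdomains>
    <service autostart="1" domain="udbapp" name="apache" recovery="relocate"/>
  </rm>
</cluster>
"""

for finding in check_rm_section(SAMPLE):
    print(finding)
```

If the nesting does turn out to matter, it could be one reason rgmanager has trouble with this file, but the logs shown in this post do not confirm that either way.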
Running service cman start on both nodes at the same time gives:
[root@udbapp1 ~]# service cman start
Starting cluster:
Loading modules... done
Mounting configfs... done
Starting ccsd... done
Starting cman... done
Starting daemons... done
Starting fencing... done
[  OK  ]
clustat shows the following:
[root@udbapp1 ~]# clustat
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
udbapp1 1 Online, Local
udbapp2 2 Online
No resources are shown as started.
[root@udbapp1 ~]# clusvcadm -e apache -m udbapp1
Member udbapp1 trying to enable service:apache...Success
service:apache is now running on udbapp1
[root@udbapp1 ~]# ps -ef|grep httpd
root 32690 30413 0 09:08 pts/1 00:00:00 grep httpd
The resource still has not started...
[root@udbapp1 ~]# service rgmanager start
Starting Cluster Service Manager: [  OK  ]
[root@udbapp1 ~]# service rgmanager status
clurgmgrd dead but pid file exists
rgmanager appeared to start, but the process then turned out to be dead...
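The status message above means the init script found a pid file whose process is no longer running. A minimal Python sketch of that kind of check follows, assuming a Linux host; the function name `pid_file_state` is made up for illustration, and the path /var/run/clurgmgrd.pid is an assumption about where the pid file lives, not something shown in this post.

```python
import os

def pid_file_state(path):
    """Classify a pid file as 'missing', 'invalid', 'stale' or 'running'."""
    try:
        with open(path) as fh:
            pid = int(fh.read().strip())
    except FileNotFoundError:
        return "missing"
    except ValueError:
        return "invalid"
    if pid <= 0:
        return "invalid"
    try:
        os.kill(pid, 0)  # signal 0 sends nothing; it only tests for existence
    except ProcessLookupError:
        return "stale"   # pid file exists but the process is gone
    except PermissionError:
        return "running"  # process exists but belongs to another user
    return "running"

# Assumed pid file location; adjust to the actual path on the node.
print(pid_file_state("/var/run/clurgmgrd.pid"))
```

A "stale" result only says that the daemon exited after writing its pid; the reason it exited would still have to come from /var/log/messages around the time rgmanager was started.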
Below are the messages logged during startup:
[root@udbapp2 ~]# tail -f /var/log/messages
Nov 30 09:11:36 udbapp2 ccsd[20433]: Starting ccsd 2.0.60:
Nov 30 09:11:36 udbapp2 ccsd[20433]: Built: Jan 23 2007 12:42:13
Nov 30 09:11:36 udbapp2 ccsd[20433]: Copyright (C) Red Hat, Inc. 2004 All rights reserved.
Nov 30 09:11:36 udbapp2 ccsd[20433]: cluster.conf (cluster name = cluster_2, version = 1) found.
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] AIS Executive Service RELEASE 'subrev 1324 version 0.80.2'
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] Copyright (C) 2006 Red Hat, Inc.
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] AIS Executive Service: started and ready to provide service.
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] openais component openais_cpg loaded.
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] Registering service handler 'openais cluster closed process group service v1.01'
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] openais component openais_cfg loaded.
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] Registering service handler 'openais configuration service'
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] openais component openais_msg loaded.
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] Registering service handler 'openais message service B.01.01'
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] openais component openais_lck loaded.
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] Registering service handler 'openais distributed locking service B.01.01'
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] openais component openais_evt loaded.
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] Registering service handler 'openais event service B.01.01'
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] openais component openais_ckpt loaded.
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] Registering service handler 'openais checkpoint service B.01.01'
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] openais component openais_amf loaded.
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] Registering service handler 'openais availability management framework B.01.01'
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] openais component openais_clm loaded.
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] Registering service handler 'openais cluster membership service B.01.01'
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] openais component openais_evs loaded.
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] Registering service handler 'openais extended virtual synchrony service'
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] openais component openais_cman loaded.
Nov 30 09:11:39 udbapp2 openais[20439]: [MAIN ] Registering service handler 'openais CMAN membership service 2.01'
Nov 30 09:11:39 udbapp2 openais[20439]: [TOTEM] Token Timeout (10000 ms) retransmit timeout (495 ms)
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] token hold (386 ms) retransmits before loss (20 retrans)
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] join (60 ms) send_join (0 ms) consensus (4800 ms) merge (200 ms)
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] downcheck (1000 ms) fail to recv const (50 msgs)
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] seqno unchanged const (30 rotations) Maximum network MTU 1500
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] window size per rotation (50 messages) maximum messages per rotation (17 messages)
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] send threads (0 threads)
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] RRP token expired timeout (495 ms)
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] RRP token problem counter (2000 ms)
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] RRP threshold (10 problem count)
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] RRP mode set to none.
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] heartbeat_failures_allowed (0)
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] max_network_delay (50 ms)
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] Receive multicast socket recv buffer size (262142 bytes).
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] The network interface [119.87.244.69] is now up.
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] Created or loaded sequence id 0.119.87.244.69 for this ring.
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] entering GATHER state from 15.
Nov 30 09:11:40 udbapp2 openais[20439]: [SERV ] Initialising service handler 'openais extended virtual synchrony service'
Nov 30 09:11:40 udbapp2 openais[20439]: [SERV ] Initialising service handler 'openais cluster membership service B.01.01'
Nov 30 09:11:40 udbapp2 openais[20439]: [SERV ] Initialising service handler 'openais availability management framework B.01.01'
Nov 30 09:11:40 udbapp2 openais[20439]: [SERV ] Initialising service handler 'openais checkpoint service B.01.01'
Nov 30 09:11:40 udbapp2 openais[20439]: [SERV ] Initialising service handler 'openais event service B.01.01'
Nov 30 09:11:40 udbapp2 openais[20439]: [SERV ] Initialising service handler 'openais distributed locking service B.01.01'
Nov 30 09:11:40 udbapp2 openais[20439]: [SERV ] Initialising service handler 'openais message service B.01.01'
Nov 30 09:11:40 udbapp2 openais[20439]: [SERV ] Initialising service handler 'openais configuration service'
Nov 30 09:11:40 udbapp2 openais[20439]: [SERV ] Initialising service handler 'openais cluster closed process group service v1.01'
Nov 30 09:11:40 udbapp2 ccsd[20433]: Initial status:: Quorate
Nov 30 09:11:40 udbapp2 openais[20439]: [SERV ] Initialising service handler 'openais CMAN membership service 2.01'
Nov 30 09:11:40 udbapp2 openais[20439]: [CMAN ] CMAN 2.0.60 (built Jan 23 2007 12:42:16) started
Nov 30 09:11:40 udbapp2 openais[20439]: [SYNC ] Not using a virtual synchrony filter.
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] Creating commit token because I am the rep.
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] Saving state aru 0 high seq received 0
Nov 30 09:11:40 udbapp2 openais[20439]: [TOTEM] entering COMMIT state.
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] entering RECOVERY state.
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] position [0] member 119.87.244.69:
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] previous ring seq 0 rep 119.87.244.69
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] aru 0 high delivered 0 received flag 0
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] Did not need to originate any messages in recovery.
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] Storing new sequence id for ring 4
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] Sending initial ORF token
Nov 30 09:11:41 udbapp2 openais[20439]: [CLM ] CLM CONFIGURATION CHANGE
Nov 30 09:11:41 udbapp2 openais[20439]: [CLM ] New Configuration:
Nov 30 09:11:41 udbapp2 openais[20439]: [CLM ] Members Left:
Nov 30 09:11:41 udbapp2 openais[20439]: [CLM ] Members Joined:
Nov 30 09:11:41 udbapp2 openais[20439]: [SYNC ] This node is within the primary component and will provide service.
Nov 30 09:11:41 udbapp2 openais[20439]: [CLM ] CLM CONFIGURATION CHANGE
Nov 30 09:11:41 udbapp2 openais[20439]: [CLM ] New Configuration:
Nov 30 09:11:41 udbapp2 openais[20439]: [CLM ] r(0) ip(119.87.244.69)
Nov 30 09:11:41 udbapp2 openais[20439]: [CLM ] Members Left:
Nov 30 09:11:41 udbapp2 openais[20439]: [CLM ] Members Joined:
Nov 30 09:11:41 udbapp2 openais[20439]: [CLM ] r(0) ip(119.87.244.69)
Nov 30 09:11:41 udbapp2 openais[20439]: [SYNC ] This node is within the primary component and will provide service.
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] entering OPERATIONAL state.
Nov 30 09:11:41 udbapp2 openais[20439]: [CMAN ] quorum regained, resuming activity
Nov 30 09:11:41 udbapp2 openais[20439]: [CLM ] got nodejoin message 119.87.244.69
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] entering GATHER state from 11.
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] Creating commit token because I am the rep.
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] Saving state aru 9 high seq received 9
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] entering COMMIT state.
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] entering RECOVERY state.
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] position [0] member 119.87.244.69:
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] previous ring seq 4 rep 119.87.244.69
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] aru 9 high delivered 9 received flag 0
Nov 30 09:11:41 udbapp2 openais[20439]: [TOTEM] position [1] member 119.87.244.70:
Nov 30 09:11:42 udbapp2 openais[20439]: [TOTEM] previous ring seq 4 rep 119.87.244.70
Nov 30 09:11:42 udbapp2 openais[20439]: [TOTEM] aru 9 high delivered 9 received flag 0
Nov 30 09:11:42 udbapp2 openais[20439]: [TOTEM] Did not need to originate any messages in recovery.
Nov 30 09:11:42 udbapp2 openais[20439]: [TOTEM] Storing new sequence id for ring 8
Nov 30 09:11:42 udbapp2 openais[20439]: [TOTEM] Sending initial ORF token
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] CLM CONFIGURATION CHANGE
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] New Configuration:
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] r(0) ip(119.87.244.69)
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] Members Left:
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] Members Joined:
Nov 30 09:11:42 udbapp2 openais[20439]: [SYNC ] This node is within the primary component and will provide service.
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] CLM CONFIGURATION CHANGE
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] New Configuration:
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] r(0) ip(119.87.244.69)
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] r(0) ip(119.87.244.70)
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] Members Left:
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] Members Joined:
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] r(0) ip(119.87.244.70)
Nov 30 09:11:42 udbapp2 openais[20439]: [SYNC ] This node is within the primary component and will provide service.
Nov 30 09:11:42 udbapp2 openais[20439]: [TOTEM] entering OPERATIONAL state.
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] got nodejoin message 119.87.244.69
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] got nodejoin message 119.87.244.70
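Reading the membership out of a log like the one above by eye is tedious, so here is a small Python sketch that extracts the member list from the last "New Configuration:" block logged by the CLM service. The function name `last_membership` is hypothetical, and the parsing relies only on the line shapes visible in this log; other openais versions may format these lines differently.

```python
import re

CLM_LINE = re.compile(r"\[CLM\s*\]\s*(.*)$")  # matches "[CLM ] <body>"
MEMBER_IP = re.compile(r"ip\(([^)]+)\)")       # matches "r(0) ip(1.2.3.4)"

def last_membership(log_text):
    """Return the member IPs listed in the final 'New Configuration:' block."""
    last = []
    in_new_config = False
    for line in log_text.splitlines():
        match = CLM_LINE.search(line)
        if not match:
            continue
        body = match.group(1).strip()
        if body == "New Configuration:":
            last = []             # start collecting a fresh member list
            in_new_config = True
        elif body.endswith(":"):  # "Members Left:" or "Members Joined:"
            in_new_config = False
        elif in_new_config:
            ip = MEMBER_IP.search(body)
            if ip:
                last.append(ip.group(1))
    return last

# The last configuration change from the log in this post.
SAMPLE_LOG = """\
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] CLM CONFIGURATION CHANGE
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] New Configuration:
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] r(0) ip(119.87.244.69)
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] r(0) ip(119.87.244.70)
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] Members Left:
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] Members Joined:
Nov 30 09:11:42 udbapp2 openais[20439]: [CLM ] r(0) ip(119.87.244.70)
Nov 30 09:11:42 udbapp2 openais[20439]: [TOTEM] entering OPERATIONAL state.
"""

print(last_membership(SAMPLE_LOG))
```

On this excerpt the final configuration contains both 119.87.244.69 and 119.87.244.70, which matches clustat reporting both nodes as Online. The membership layer looks healthy in this log, so it says nothing yet about why rgmanager dies.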
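Could everyone please help check whether the messages contain any errors? Other people get this set up in no time, but mine just won't work.
1. The documentation I read does not explicitly say that a dedicated heartbeat network is required, so the heartbeat can go directly over the data interface. But which configuration file determines which network the heartbeat uses?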