Chinaunix
cman fails to start

#1 | Posted 2008-10-24 16:56
cman fails to start, and no specific reason is given. Can anyone tell me why?

The relevant output is below:

[root@cms2 ~]# service cman restart
Stopping cluster:
   Stopping fencing... done
   Stopping cman... done
   Stopping ccsd... done
   Unmounting configfs... done
[  OK  ]
Starting cluster:
   Loading modules... done
   Mounting configfs... done
   Starting ccsd... done
   Starting cman... failed

[FAILED]


The messages log shows:
Oct 24 16:24:57 cms2 openais[18831]: [TOTEM] heartbeat_failures_allowed (0)
Oct 24 16:24:57 cms2 openais[18831]: [TOTEM] max_network_delay (50 ms)
Oct 24 16:24:57 cms2 openais[18831]: [TOTEM] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Oct 24 16:24:57 cms2 openais[18831]: [TOTEM] Receive multicast socket recv buffer size (262142 bytes).
Oct 24 16:24:57 cms2 openais[18831]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Oct 24 16:24:57 cms2 openais[18831]: [TOTEM] The network interface [192.168.201.2] is now up.
Oct 24 16:24:57 cms2 openais[18831]: [TOTEM] Created or loaded sequence id 0.192.168.201.2 for this ring.
Oct 24 16:24:57 cms2 openais[18831]: [TOTEM] entering GATHER state from 15.
Oct 24 16:24:57 cms2 openais[18831]: [SERV ] Initialising service handler 'openais extended virtual synchrony service'
Oct 24 16:24:58 cms2 openais[18831]: [SERV ] Initialising service handler 'openais cluster membership service B.01.01'
Oct 24 16:24:58 cms2 openais[18831]: [SERV ] Initialising service handler 'openais availability management framework B.01.01'
Oct 24 16:24:58 cms2 openais[18831]: [SERV ] Initialising service handler 'openais checkpoint service B.01.01'
Oct 24 16:24:58 cms2 openais[18831]: [SERV ] Initialising service handler 'openais event service B.01.01'
Oct 24 16:24:58 cms2 openais[18831]: [SERV ] Initialising service handler 'openais distributed locking service B.01.01'
Oct 24 16:24:58 cms2 openais[18831]: [SERV ] Initialising service handler 'openais message service B.01.01'
Oct 24 16:24:58 cms2 openais[18831]: [SERV ] Initialising service handler 'openais configuration service'
Oct 24 16:24:58 cms2 openais[18831]: [SERV ] Initialising service handler 'openais cluster closed process group service v1.01'
Oct 24 16:24:58 cms2 openais[18831]: [SERV ] Initialising service handler 'openais CMAN membership service 2.01'
Oct 24 16:24:58 cms2 openais[18831]: [CMAN ] CMAN 2.0.60 (built Jan 23 2007 12:42:29) started
Oct 24 16:24:58 cms2 openais[18831]: [SYNC ] Not using a virtual synchrony filter.
Oct 24 16:24:58 cms2 openais[18831]: [IPC  ] ERROR: Could not bind AF_UNIX: Address already in use.
Oct 24 16:24:58 cms2 openais[18831]: [MAIN ] AIS Executive exiting (-7).
Oct 24 16:24:59 cms2 ccsd[18750]: Unable to connect to cluster infrastructure after 30 seconds.

#2 | Posted 2008-10-25 01:50
Oct 24 16:24:58 cms2 openais[18831]: [IPC  ] ERROR: Could not bind AF_UNIX: Address already in use.

#3 | Posted 2008-10-25 09:00
Originally posted by hmqq on 2008-10-25 01:50:
Oct 24 16:24:58 cms2 openais[18831]: [IPC  ] ERROR: Could not bind AF_UNIX: Address already in use.


Which address does this refer to?

#4 | Posted 2008-10-25 09:48

Reply to #3 by oioilu

Re "Which address does this refer to?": the port number(s).
cman uses UDP ports 5404 and 5405.

Check the current firewall rules:

iptables -L

Then either flush them:

iptables -F

or allow the cluster ports:

iptables -A INPUT -i eth0 -p udp -m multiport --dports 5404,5405 -m state --state NEW \
  -s 10.10.10.0/24 -d 10.10.10.0/24 -j ACCEPT

Here eth0 stands for the interface that carries your cluster traffic (-i takes an interface name, not an IP address; substitute your own), and 10.10.10.0/24 is your network.

If you do not want to get involved with iptables, just run:
service iptables stop
chkconfig iptables off

Another way to see which ports are in use right now:

nmap -sS -O localhost
or
netstat -an

It would also help to know your OS: RHEL 5, RHEL 4, or something else.

[ Last edited by gl00ad on 2008-10-25 09:51 ]

#5 | Posted 2008-10-25 20:12

Reply to #4 by gl00ad

Thanks!
I'm running RHEL 5.

[root@cms2 ~]# uname -an
Linux cms2 2.6.18-8.el5 #1 SMP Fri Jan 26 14:15:21 EST 2007 i686 i686 i386 GNU/Linux

UDP 5404/5405 are already open in my iptables rules, but the error persists. Disabling iptables makes no difference. Is there anything else I should check?

Here is the configuration:

[root@cms2 ~]# more /etc/sysconfig/iptables
# Generated by iptables-save v1.3.5 on Sat Oct 25 18:48:52 2008
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [3208937289:8709118251955]
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -p esp -j ACCEPT
-A RH-Firewall-1-INPUT -p ah -j ACCEPT
-A RH-Firewall-1-INPUT -d 224.0.0.251 -p udp -m udp --dport 5353 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m udp --dport 631 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 631 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 21 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 80 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m state --state NEW -m udp --dport 5404 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m state --state NEW -m udp --dport 5405 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 16851 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 21064 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 41966 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 41967 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 41968 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 41969 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 50006 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 50008 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 50009 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m state --state NEW -m udp --dport 50007 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m state --state NEW -m udp --dport 161 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 8070 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 873 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT
# Completed on Sat Oct 25 18:48:52 2008
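
A quick way to confirm that a saved rule set really accepts the cluster ports is to scan it for matching rules. This is only a minimal sketch: udp_ports_accepted is a hypothetical helper name, it recognises single-port "--dport N" rules like the ones above (not multiport rules), and it checks the text of the file rather than whether the rule is actually reached in the chain.

```shell
# Report, for each UDP port given as an argument, whether the iptables-save
# text on stdin contains an ACCEPT rule with a matching --dport.
# Assumption: one port per rule ("--dport N"); multiport rules are not parsed.
udp_ports_accepted() {
    rules=$(cat)
    for port in "$@"; do
        if printf '%s\n' "$rules" | grep -Eq -- "-p udp .*--dport $port .*-j ACCEPT"; then
            echo "$port accepted"
        else
            echo "$port not found"
        fi
    done
}
```

Usage, assuming the rules live in /etc/sysconfig/iptables as in this post: udp_ports_accepted 5404 5405 < /etc/sysconfig/iptables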

#6 | Posted 2008-10-25 23:02

Reply to #5 by oioilu

I can't believe this is taking so long.
Let's do this:
chkconfig iptables off
service iptables stop
netstat -an|grep udp
cat /etc/cluster/cluster.conf

Another thing to look at is multicast, since openais uses multicast. I am not sure about this, though. I still think your problem should be simpler than that, but you never know.

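#7 | Posted 2008-10-25 23:25
You need to make sure there are no addresses on the cluster's network that could conflict.
Possible conflicts include broadcast addresses and floating IP addresses.

Please post /etc/hosts, /etc/cluster/cluster.conf, and the output of ifconfig.
The simplest test is to connect the heartbeat link directly between the nodes and see whether cman starts. If it does, there is certainly a conflicting IP on the network.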

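#8 | Posted 2008-10-26 09:53
I tested this earlier: cman runs whenever the machine is rebooted (with iptables enabled and the configuration unchanged). But this is now a production environment, so I can't reboot.

The heartbeat link between the two nodes is already a direct connection.

Judging by the unicast IPs, there should be no duplicates. My guess is that the multicast address RHCS uses for internal communication may be the problem. cms1 runs fine, but this node, cms2, does not work.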

[root@cms2 ~]# clustat
CMAN is not running.
[root@cms2 ~]# netstat -an|grep udp
udp        0      0 0.0.0.0:32769               0.0.0.0:*                              
udp        0      0 0.0.0.0:514                 0.0.0.0:*                              
udp        0      0 127.0.0.1:5405              0.0.0.0:*                              
udp        0      0 127.0.0.1:5149              0.0.0.0:*                              
udp        0      0 226.94.1.1:5405             0.0.0.0:*                              
udp        0      0 0.0.0.0:161                 0.0.0.0:*                              
udp        0      0 0.0.0.0:825                 0.0.0.0:*                              
udp        0      0 0.0.0.0:828                 0.0.0.0:*                              
udp        0      0 0.0.0.0:5353                0.0.0.0:*                              
udp        0      0 0.0.0.0:111                 0.0.0.0:*                              
udp        0      0 0.0.0.0:631                 0.0.0.0:*                              
udp        0      0 :::32770                    :::*                                    
udp        0      0 :::32771                    :::*                                    
udp        0      0 :::32772                    :::*                                    
udp        0      0 :::2463                     :::*                                    
udp        0      0 :::50007                    :::*                                    
udp        0      0 :::5353                     :::*                             

[root@cms2 ~]# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster alias="cms" config_version="4" name="cms">
        <fence_daemon clean_start="1" post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="cms1" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device lanplus="1" name="cms1-fence"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="cms2" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device lanplus="1" name="cms2-fence"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices>
                <fencedevice agent="fence_ipmilan" auth="password" ipaddr="192.168.201.11" login="a" name="cms1-fence" passwd="a"/>
                <fencedevice agent="fence_ipmilan" auth="password" ipaddr="192.168.201.12" login="a" name="cms2-fence" passwd="a"/>
        </fencedevices>
        <rm>
                <failoverdomains/>
                <resources/>
        </rm>
</cluster>



[root@cms2 ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1       localhost.localdomain   localhost       cms2.Guangdong
::1     localhost6.localdomain6 localhost       cms2.Guangdong6
192.168.201.1 cms1
192.168.201.2 cms2




[root@cms2 ~]# ifconfig
bond0     Link encap:Ethernet  HWaddr 00:1E:4F:39:91:94  
          inet addr:IPA  Bcast:IPB  Mask:255.255.255.240
          inet6 addr: fe80::21e:4fff:fe39:9194/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:4177457574 errors:0 dropped:0 overruns:0 frame:0
          TX packets:608846683 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2005133999 (1.8 GiB)  TX bytes:1653861240 (1.5 GiB)

eth0      Link encap:Ethernet  HWaddr 00:1E:4F:39:91:92  
          inet addr:192.168.201.2  Bcast:192.168.201.255  Mask:255.255.255.0
          inet6 addr: fe80::21e:4fff:fe39:9192/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:34221328 errors:0 dropped:0 overruns:0 frame:0
          TX packets:24361128 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1013490086 (966.5 MiB)  TX bytes:3409997433 (3.1 GiB)
          Interrupt:169 Memory:da000000-da012100

eth1      Link encap:Ethernet  HWaddr 00:1E:4F:39:91:94  
          inet6 addr: fe80::21e:4fff:fe39:9194/64 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:4177451384 errors:0 dropped:0 overruns:0 frame:0
          TX packets:608846666 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2003644680 (1.8 GiB)  TX bytes:1653856872 (1.5 GiB)
          Interrupt:169 Memory:d6000000-d6012100

eth2      Link encap:Ethernet  HWaddr 00:1E:4F:39:91:94  
          inet6 addr: fe80::21e:4fff:fe39:9194/64 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:6190 errors:0 dropped:0 overruns:0 frame:0
          TX packets:17 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1489319 (1.4 MiB)  TX bytes:4368 (4.2 KiB)
          Base address:0xdce0 Memory:d5ee0000-d5f00000

[ Last edited by oioilu on 2008-10-26 09:56 ]

#9 | Posted 2008-10-26 10:30

Reply to #8 by oioilu

udp        0      0 127.0.0.1:5405              0.0.0.0:*                              
udp        0      0 127.0.0.1:5149              0.0.0.0:*                              
udp        0      0 226.94.1.1:5405             0.0.0.0:*

Port 5405 is already in use, on both 127.0.0.1:5405 and the multicast address 226.94.1.1:5405. You need to look into this.

Run these and tell us the output:

lsof -i @127.0.0.1:5405
lsof -i @226.94.1.1:5405

Or, better:

lsof -i UDP:5405

[ Last edited by gl00ad on 2008-10-26 11:35 ]
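
To see at a glance which process holds the port, the lsof output can be reduced to command and PID pairs. A minimal sketch, assuming lsof's default column layout (COMMAND first, PID second); holders_from_lsof is a hypothetical helper name, not part of lsof.

```shell
# Print the unique "command pid" pairs from lsof output read on stdin.
# Assumption: the first line is the lsof header and is skipped.
holders_from_lsof() {
    awk 'NR > 1 { print $1, $2 }' | sort -u
}
```

Usage: lsof -nP -i UDP:5405 | holders_from_lsof. The -n and -P options keep addresses and ports numeric, so the port shows up as 5405 rather than by its service name.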

#10 | Posted 2008-10-26 18:55

Reply to #9 by gl00ad

Here is the output



[root@cms2 ~]# lsof -i UDP:5405
COMMAND  PID USER   FD   TYPE   DEVICE SIZE NODE NAME
aisexec 9884 root    3u  IPv4 37348117       UDP 226.94.1.1:netsupport
aisexec 9884 root    5u  IPv4 37348119       UDP localhost.localdomain:netsupport


Strange. Isn't aisexec restarted automatically by service cman restart?


[root@cms2 ~]# ps -ef | grep aisexec
root      9884     1  5 Jul31 ?        4-13:17:07 aisexec
root     13708 13649  0 18:44 pts/1    00:00:00 grep aisexec


I killed process 9884, and now fencing fails instead.


[root@cms2 ~]# service cman restart
Stopping cluster:
   Stopping fencing... done
   Stopping cman... done
   Stopping ccsd... done
   Unmounting configfs... done
[  OK  ]
Starting cluster:
   Loading modules... done
   Mounting configfs... done
   Starting ccsd... done
   Starting cman... done
   Starting daemons... done
   Starting fencing... failed

[FAILED]



The related syslog output has changed.
Here is the log on node2:

Oct 26 18:45:05 cms2 ccsd[13898]: Starting ccsd 2.0.60:
Oct 26 18:45:05 cms2 ccsd[13898]:  Built: Jan 23 2007 12:42:25
Oct 26 18:45:05 cms2 ccsd[13898]:  Copyright (C) Red Hat, Inc.  2004  All rights reserved.
Oct 26 18:45:05 cms2 ccsd[13898]: cluster.conf (cluster name = cms, version = 4) found.
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] AIS Executive Service RELEASE 'subrev 1324 version 0.80.2'
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] Copyright (C) 2006 Red Hat, Inc.
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] AIS Executive Service: started and ready to provide service.
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] Using default multicast address of 239.192.2.219
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] openais component openais_cpg loaded.
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] Registering service handler 'openais cluster closed process group service v1.01'
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] openais component openais_cfg loaded.
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] Registering service handler 'openais configuration service'
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] openais component openais_msg loaded.
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] Registering service handler 'openais message service B.01.01'
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] openais component openais_lck loaded.
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] Registering service handler 'openais distributed locking service B.01.01'
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] openais component openais_evt loaded.
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] Registering service handler 'openais event service B.01.01'
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] openais component openais_ckpt loaded.
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] Registering service handler 'openais checkpoint service B.01.01'
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] openais component openais_amf loaded.
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] Registering service handler 'openais availability management framework B.01.01'
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] openais component openais_clm loaded.
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] Registering service handler 'openais cluster membership service B.01.01'
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] openais component openais_evs loaded.
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] Registering service handler 'openais extended virtual synchrony service'
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] openais component openais_cman loaded.
Oct 26 18:45:06 cms2 openais[13904]: [MAIN ] Registering service handler 'openais CMAN membership service 2.01'
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] Token Timeout (10000 ms) retransmit timeout (495 ms)
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] token hold (386 ms) retransmits before loss (20 retrans)
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] join (60 ms) send_join (0 ms) consensus (4800 ms) merge (200 ms)
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] downcheck (1000 ms) fail to recv const (50 msgs)
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] seqno unchanged const (30 rotations) Maximum network MTU 1500
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] window size per rotation (50 messages) maximum messages per rotation (17 messages)
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] send threads (0 threads)
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] RRP token expired timeout (495 ms)
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] RRP token problem counter (2000 ms)
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] RRP threshold (10 problem count)
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] RRP mode set to none.
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] heartbeat_failures_allowed (0)
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] max_network_delay (50 ms)
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] Receive multicast socket recv buffer size (262142 bytes).
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] The network interface [192.168.201.2] is now up.
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] Created or loaded sequence id 0.192.168.201.2 for this ring.
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] entering GATHER state from 15.
Oct 26 18:45:07 cms2 openais[13904]: [SERV ] Initialising service handler 'openais extended virtual synchrony service'
Oct 26 18:45:07 cms2 openais[13904]: [SERV ] Initialising service handler 'openais cluster membership service B.01.01'
Oct 26 18:45:07 cms2 openais[13904]: [SERV ] Initialising service handler 'openais availability management framework B.01.01'
Oct 26 18:45:07 cms2 ccsd[13898]: Initial status:: Quorate
Oct 26 18:45:07 cms2 openais[13904]: [SERV ] Initialising service handler 'openais checkpoint service B.01.01'
Oct 26 18:45:07 cms2 openais[13904]: [SERV ] Initialising service handler 'openais event service B.01.01'
Oct 26 18:45:07 cms2 openais[13904]: [SERV ] Initialising service handler 'openais distributed locking service B.01.01'
Oct 26 18:45:07 cms2 openais[13904]: [SERV ] Initialising service handler 'openais message service B.01.01'
Oct 26 18:45:07 cms2 openais[13904]: [SERV ] Initialising service handler 'openais configuration service'
Oct 26 18:45:07 cms2 openais[13904]: [SERV ] Initialising service handler 'openais cluster closed process group service v1.01'
Oct 26 18:45:07 cms2 openais[13904]: [SERV ] Initialising service handler 'openais CMAN membership service 2.01'
Oct 26 18:45:07 cms2 openais[13904]: [CMAN ] CMAN 2.0.60 (built Jan 23 2007 12:42:29) started
Oct 26 18:45:07 cms2 openais[13904]: [SYNC ] Not using a virtual synchrony filter.
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] Creating commit token because I am the rep.
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] Saving state aru 0 high seq received 0
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] entering COMMIT state.
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] entering RECOVERY state.
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] position [0] member 192.168.201.2:
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] previous ring seq 0 rep 192.168.201.2
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] aru 0 high delivered 0 received flag 0
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] Did not need to originate any messages in recovery.
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] Storing new sequence id for ring 4
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] Sending initial ORF token
Oct 26 18:45:07 cms2 openais[13904]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 26 18:45:07 cms2 openais[13904]: [CLM  ] New Configuration:
Oct 26 18:45:07 cms2 openais[13904]: [CLM  ] Members Left:
Oct 26 18:45:07 cms2 openais[13904]: [CLM  ] Members Joined:
Oct 26 18:45:07 cms2 openais[13904]: [SYNC ] This node is within the primary component and will provide service.
Oct 26 18:45:07 cms2 openais[13904]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 26 18:45:07 cms2 openais[13904]: [CLM  ] New Configuration:
Oct 26 18:45:07 cms2 openais[13904]: [CLM  ]    r(0) ip(192.168.201.2)
Oct 26 18:45:07 cms2 openais[13904]: [CLM  ] Members Left:
Oct 26 18:45:07 cms2 openais[13904]: [CLM  ] Members Joined:
Oct 26 18:45:07 cms2 openais[13904]: [CLM  ]    r(0) ip(192.168.201.2)
Oct 26 18:45:07 cms2 openais[13904]: [SYNC ] This node is within the primary component and will provide service.
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] entering OPERATIONAL state.
Oct 26 18:45:07 cms2 openais[13904]: [CMAN ] quorum regained, resuming activity
Oct 26 18:45:07 cms2 openais[13904]: [CLM  ] got nodejoin message 192.168.201.2
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] entering GATHER state from 11.
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] Saving state aru 9 high seq received 9
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] entering COMMIT state.
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] entering RECOVERY state.
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] position [0] member 192.168.201.1:
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] previous ring seq 116 rep 192.168.201.1
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] aru c high delivered c received flag 0
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] position [1] member 192.168.201.2:
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] previous ring seq 4 rep 192.168.201.2
Oct 26 18:45:07 cms2 openais[13904]: [TOTEM] aru 9 high delivered 9 received flag 0
Oct 26 18:45:08 cms2 openais[13904]: [TOTEM] Did not need to originate any messages in recovery.
Oct 26 18:45:08 cms2 openais[13904]: [TOTEM] Storing new sequence id for ring 78
Oct 26 18:45:08 cms2 groupd[13912]: found uncontrolled kernel object rgmanager in /sys/kernel/dlm
Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 26 18:45:08 cms2 groupd[13912]: found uncontrolled kernel object clvmd in /sys/kernel/dlm
Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] New Configuration:
Oct 26 18:45:08 cms2 groupd[13912]: local node must be reset to clear 2 uncontrolled instances of gfs and/or dlm
Oct 26 18:45:08 cms2 openais[13904]: [CLM  ]    r(0) ip(192.168.201.2)
Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] Members Left:
Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] Members Joined:
Oct 26 18:45:08 cms2 openais[13904]: [SYNC ] This node is within the primary component and will provide service.
Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] New Configuration:
Oct 26 18:45:08 cms2 openais[13904]: [CLM  ]    r(0) ip(192.168.201.1)
Oct 26 18:45:08 cms2 openais[13904]: [CLM  ]    r(0) ip(192.168.201.2)
Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] Members Left:
Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] Members Joined:
Oct 26 18:45:08 cms2 openais[13904]: [CLM  ]    r(0) ip(192.168.201.1)
Oct 26 18:45:08 cms2 openais[13904]: [SYNC ] This node is within the primary component and will provide service.
Oct 26 18:45:08 cms2 openais[13904]: [TOTEM] entering OPERATIONAL state.
Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] got nodejoin message 192.168.201.1
Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] got nodejoin message 192.168.201.2
Oct 26 18:45:08 cms2 openais[13904]: [CPG  ] got joinlist message from node 1
Oct 26 18:45:08 cms2 openais[13904]: [CMAN ] cman killed by node 2 for reason 2
Oct 26 18:45:08 cms2 dlm_controld[13924]: cluster is down, exiting
Oct 26 18:45:08 cms2 gfs_controld[13930]: cluster is down, exiting
Oct 26 18:45:08 cms2 fenced[13918]: cluster is down, exiting
Oct 26 18:45:08 cms2 kernel: dlm: closing connection to node 2
Oct 26 18:45:08 cms2 kernel: dlm: closing connection to node 1
Oct 26 18:45:35 cms2 ccsd[13898]: Unable to connect to cluster infrastructure after 30 seconds.
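
In a log this long the few lines that explain the failure are easy to miss: groupd reports uncontrolled dlm objects (rgmanager, clvmd) and says the local node must be reset, and cman is then killed. A minimal sketch of a filter that pulls such lines out; cluster_log_highlights is a hypothetical helper name, and the patterns are simply the messages seen in this thread, not an exhaustive list.

```shell
# Print only the syslog lines (read on stdin) that match failure messages
# seen in this thread. Assumption: these patterns are illustrative only.
cluster_log_highlights() {
    grep -E 'ERROR|killed by node|uncontrolled kernel object|must be reset|cluster is down|Unable to connect to cluster infrastructure'
}
```

Usage, assuming syslog writes to /var/log/messages: cluster_log_highlights < /var/log/messages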



Here is the log on node1:


Oct 26 18:56:03 cms1 openais[3369]: [TOTEM] entering GATHER state from 11.
Oct 26 18:56:03 cms1 openais[3369]: [TOTEM] Creating commit token because I am the rep.
Oct 26 18:56:03 cms1 openais[3369]: [TOTEM] Saving state aru c high seq received c
Oct 26 18:56:03 cms1 openais[3369]: [TOTEM] entering COMMIT state.
Oct 26 18:56:03 cms1 openais[3369]: [TOTEM] entering RECOVERY state.
Oct 26 18:56:03 cms1 openais[3369]: [TOTEM] position [0] member 192.168.201.1:
Oct 26 18:56:03 cms1 openais[3369]: [TOTEM] previous ring seq 124 rep 192.168.201.1
Oct 26 18:56:03 cms1 openais[3369]: [TOTEM] aru c high delivered c received flag 0
Oct 26 18:56:03 cms1 openais[3369]: [TOTEM] position [1] member 192.168.201.2:
Oct 26 18:56:03 cms1 openais[3369]: [TOTEM] previous ring seq 4 rep 192.168.201.2
Oct 26 18:56:03 cms1 openais[3369]: [TOTEM] aru 9 high delivered 9 received flag 0
Oct 26 18:56:03 cms1 openais[3369]: [TOTEM] Did not need to originate any messages in recovery.
Oct 26 18:56:03 cms1 openais[3369]: [TOTEM] Storing new sequence id for ring 80
Oct 26 18:56:03 cms1 openais[3369]: [TOTEM] Sending initial ORF token
Oct 26 18:56:03 cms1 openais[3369]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 26 18:56:03 cms1 openais[3369]: [CLM  ] New Configuration:
Oct 26 18:56:03 cms1 openais[3369]: [CLM  ]     r(0) ip(192.168.201.1)
Oct 26 18:56:03 cms1 openais[3369]: [CLM  ] Members Left:
Oct 26 18:56:03 cms1 openais[3369]: [CLM  ] Members Joined:
Oct 26 18:56:03 cms1 openais[3369]: [SYNC ] This node is within the primary component and will provide service.
Oct 26 18:56:03 cms1 openais[3369]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 26 18:56:03 cms1 openais[3369]: [CLM  ] New Configuration:
Oct 26 18:56:03 cms1 openais[3369]: [CLM  ]     r(0) ip(192.168.201.1)
Oct 26 18:56:03 cms1 openais[3369]: [CLM  ]     r(0) ip(192.168.201.2)
Oct 26 18:56:03 cms1 openais[3369]: [CLM  ] Members Left:
Oct 26 18:56:03 cms1 openais[3369]: [CLM  ] Members Joined:
Oct 26 18:56:03 cms1 openais[3369]: [CLM  ]     r(0) ip(192.168.201.2)
Oct 26 18:56:03 cms1 openais[3369]: [SYNC ] This node is within the primary component and will provide service.
Oct 26 18:56:03 cms1 openais[3369]: [TOTEM] entering OPERATIONAL state.
Oct 26 18:56:03 cms1 openais[3369]: [CLM  ] got nodejoin message 192.168.201.1
Oct 26 18:56:03 cms1 openais[3369]: [CLM  ] got nodejoin message 192.168.201.2
Oct 26 18:56:03 cms1 openais[3369]: [CPG  ] got joinlist message from node 1
Oct 26 18:56:14 cms1 openais[3369]: [TOTEM] The token was lost in the OPERATIONAL state.
Oct 26 18:56:14 cms1 openais[3369]: [TOTEM] Receive multicast socket recv buffer size (288000 bytes).
Oct 26 18:56:14 cms1 openais[3369]: [TOTEM] Transmit multicast socket send buffer size (288000 bytes).
Oct 26 18:56:14 cms1 openais[3369]: [TOTEM] entering GATHER state from 2.
Oct 26 18:56:19 cms1 openais[3369]: [TOTEM] entering GATHER state from 0.
Oct 26 18:56:19 cms1 openais[3369]: [TOTEM] Creating commit token because I am the rep.
Oct 26 18:56:19 cms1 openais[3369]: [TOTEM] Saving state aru 17 high seq received 17
Oct 26 18:56:19 cms1 openais[3369]: [TOTEM] entering COMMIT state.
Oct 26 18:56:19 cms1 openais[3369]: [TOTEM] entering RECOVERY state.
Oct 26 18:56:19 cms1 openais[3369]: [TOTEM] position [0] member 192.168.201.1:
Oct 26 18:56:19 cms1 openais[3369]: [TOTEM] previous ring seq 128 rep 192.168.201.1
Oct 26 18:56:19 cms1 openais[3369]: [TOTEM] aru 17 high delivered 17 received flag 0
Oct 26 18:56:19 cms1 openais[3369]: [TOTEM] Did not need to originate any messages in recovery.
Oct 26 18:56:19 cms1 openais[3369]: [TOTEM] Storing new sequence id for ring 84
Oct 26 18:56:19 cms1 openais[3369]: [TOTEM] Sending initial ORF token
Oct 26 18:56:19 cms1 openais[3369]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 26 18:56:19 cms1 openais[3369]: [CLM  ] New Configuration:
Oct 26 18:56:19 cms1 kernel: dlm: closing connection to node 2
Oct 26 18:56:19 cms1 openais[3369]: [CLM  ]     r(0) ip(192.168.201.1)
Oct 26 18:56:19 cms1 openais[3369]: [CLM  ] Members Left:
Oct 26 18:56:19 cms1 openais[3369]: [CLM  ]     r(0) ip(192.168.201.2)
Oct 26 18:56:19 cms1 openais[3369]: [CLM  ] Members Joined:
Oct 26 18:56:19 cms1 openais[3369]: [SYNC ] This node is within the primary component and will provide service.
Oct 26 18:56:19 cms1 openais[3369]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 26 18:56:19 cms1 openais[3369]: [CLM  ] New Configuration:
Oct 26 18:56:19 cms1 openais[3369]: [CLM  ]     r(0) ip(192.168.201.1)
Oct 26 18:56:19 cms1 openais[3369]: [CLM  ] Members Left:
Oct 26 18:56:19 cms1 openais[3369]: [CLM  ] Members Joined:
Oct 26 18:56:19 cms1 openais[3369]: [SYNC ] This node is within the primary component and will provide service.
Oct 26 18:56:19 cms1 openais[3369]: [TOTEM] entering OPERATIONAL state.
Oct 26 18:56:19 cms1 openais[3369]: [CLM  ] got nodejoin message 192.168.201.1
Oct 26 18:56:19 cms1 openais[3369]: [CPG  ] got joinlist message from node 1

[ Last edited by oioilu on 2008-10-26 19:09 ]
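
The lsof and ps output in this post suggests that service cman restart did not stop the old aisexec (PID 9884, running since Jul 31), so the new instance could not bind its socket. A minimal sketch of a check to run between stopping and starting cman; check_leftover_aisexec is a hypothetical helper name, and it only reports leftovers rather than killing anything, since killing aisexec by hand is what led to the fencing failure above.

```shell
# Read PIDs of any aisexec processes still alive (one per line on stdin)
# and say whether it looks safe to start cman. Returns 1 if any are left.
check_leftover_aisexec() {
    pids=$(tr '\n' ' ' | sed 's/ *$//')
    if [ -n "$pids" ]; then
        echo "aisexec still running (pid $pids); stop it before starting cman"
        return 1
    fi
    echo "no leftover aisexec; safe to start cman"
}
```

Usage: service cman stop; pgrep -x aisexec | check_leftover_aisexec && service cman start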