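Two machines running Red Hat Enterprise Linux Server release 5 (Tikanga) are set up as a two-node HA pair, using the cluster software bundled with Linux and monitored through crm. The version is heartbeat-2.1.4-4.1. Running crm_mon currently reports the following errors: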
Failed actions:
httpd_2_start_0 (node=sso1.example, call=10, rc=1): Error
httpd_2_start_0 (node=sso2.example, call=7, rc=1): Error
[root@sso1 log]# clustat -l
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
sso1.example 1 Online, Local, rgmanager
sso2.example 2 Online, rgmanager
Service Information
------- -----------
Service Name : service:ssosvc
Current State : started (112)
Owner : sso2.example
Last Owner : sso1.example
Last Transition : Wed Aug 25 00:05:22 2010
The clustat -l output looks normal.
[root@sso1 log]# ps -aux|grep heartbeat
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.7/FAQ
root 17995 0.0 0.3 12192 12192 ? SLs Jul09 0:34 heartbeat: master control process
nobody 17998 0.0 0.1 5596 5596 ? SL Jul09 0:00 heartbeat: FIFO reader
nobody 17999 0.0 0.1 5592 5592 ? SL Jul09 0:01 heartbeat: write: bcast eth2
nobody 18000 0.0 0.1 5592 5592 ? SL Jul09 0:00 heartbeat: read: bcast eth2
nobody 18001 0.0 0.1 5592 5592 ? SL Jul09 0:01 heartbeat: write: ucast eth2
nobody 18002 0.0 0.1 5592 5592 ? SL Jul09 0:01 heartbeat: read: ucast eth2
nobody 18003 0.0 0.1 5592 5592 ? SL Jul09 0:01 heartbeat: write: ping 192.100.2.62
nobody 18004 0.0 0.1 5592 5592 ? SL Jul09 0:02 heartbeat: read: ping 192.100.2.62
24 18190 0.0 0.0 4896 1536 ? S Jul09 0:00 /usr/lib/heartbeat/ccm
24 18191 0.0 0.0 6432 2568 ? S Jul09 0:00 /usr/lib/heartbeat/cib
root 18192 0.0 0.0 4888 1840 ? S Jul09 0:00 /usr/lib/heartbeat/lrmd -r
nobody 18193 0.0 0.1 4604 4604 ? SL Jul09 0:00 /usr/lib/heartbeat/stonithd
24 18194 0.0 0.0 4568 1384 ? S Jul09 0:00 /usr/lib/heartbeat/attrd
24 18195 0.0 0.0 5532 2584 ? S Jul09 0:00 /usr/lib/heartbeat/crmd
24 18368 0.0 0.0 5044 1868 ? S Jul09 0:00 /usr/lib/heartbeat/tengine
24 18369 0.0 0.0 6000 2300 ? S Jul09 0:00 /usr/lib/heartbeat/pengine
root 19193 0.0 0.0 3896 680 pts/1 S+ 09:39 0:00 grep heartbeat
The heartbeat processes are all running normally as well.
[root@sso1 log]# crm_mon
Refresh in 14s...
============
Last updated: Tue Aug 31 09:39:32 2010
Current DC: sso1.example (81a9a736-93b6-44d8-9e29-7c0e0237c541)
2 Nodes configured.
1 Resources configured.
============
Node: sso2.example (73e15517-2a8b-4289-8512-0c5f8523eab3): online
Node: sso1.example (81a9a736-93b6-44d8-9e29-7c0e0237c541): online
Failed actions:
httpd_2_start_0 (node=sso1.example, call=10, rc=1): Error
httpd_2_start_0 (node=sso2.example, call=7, rc=1): Error
However, crm_mon reports errors, as shown above.
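Each failed-action line in the crm_mon output packs several fields into one string. As a rough sketch (not part of heartbeat itself), the Python below splits such a line into resource, operation, interval, node, call ID, and return code. It assumes the operation key follows the `resource_operation_interval` pattern seen in this output; the function name `parse_failed_actions` is made up for illustration.

```python
import re

# Matches lines such as:
#   httpd_2_start_0 (node=sso1.example, call=10, rc=1): Error
# Assumption: the leading key is <resource>_<operation>_<interval>.
FAILED_ACTION = re.compile(
    r"^\s*(?P<resource>\S+)_(?P<op>[a-z-]+)_(?P<interval>\d+)\s+"
    r"\(node=(?P<node>[^,]+),\s*call=(?P<call>\d+),\s*rc=(?P<rc>\d+)\):"
    r"\s*(?P<status>.+?)\s*$"
)


def parse_failed_actions(text):
    """Return one dict per failed-action line found in the given text."""
    results = []
    for line in text.splitlines():
        match = FAILED_ACTION.match(line)
        if not match:
            continue  # skip headers, node lines, and anything else
        fields = match.groupdict()
        # Convert the numeric fields so they can be compared or sorted.
        for key in ("interval", "call", "rc"):
            fields[key] = int(fields[key])
        results.append(fields)
    return results
```

Feeding it the two lines above gives the resource `httpd_2`, the operation `start`, and `rc` 1 on both nodes. A return code of 1 is the generic, unspecified error in the LSB init-script convention, so the line by itself does not say why httpd failed to start.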
The configuration file is as follows:
[root@sso1 log]# more /var/lib/heartbeat/crm/cib.xml
<cib admin_epoch="0" epoch="0" num_updates="0" generated="false" have_quorum="true" ignore_dtd="false" num_peers="0" cib-last-written="Fri Jul 9 11:20:15 2010" ccm_transition="1">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <attributes>
          <nvpair id="cib-bootstrap-options-symmetric-cluster" name="symmetric-cluster" value="true"/>
          <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="stop"/>
          <nvpair id="cib-bootstrap-options-default-resource-stickiness" name="default-resource-stickiness" value="0"/>
          <nvpair id="cib-bootstrap-options-default-resource-failure-stickiness" name="default-resource-failure-stickiness" value="0"/>
          <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
          <nvpair id="cib-bootstrap-options-stonith-action" name="stonith-action" value="reboot"/>
          <nvpair id="cib-bootstrap-options-stop-orphan-resources" name="stop-orphan-resources" value="true"/>
          <nvpair id="cib-bootstrap-options-stop-orphan-actions" name="stop-orphan-actions" value="true"/>
          <nvpair id="cib-bootstrap-options-remove-after-stop" name="remove-after-stop" value="false"/>
          <nvpair id="cib-bootstrap-options-short-resource-names" name="short-resource-names" value="true"/>
          <nvpair id="cib-bootstrap-options-transition-idle-timeout" name="transition-idle-timeout" value="5min"/>
          <nvpair id="cib-bootstrap-options-default-action-timeout" name="default-action-timeout" value="15s"/>
          <nvpair id="cib-bootstrap-options-is-managed-default" name="is-managed-default" value="true"/>
        </attributes>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="73e15517-2a8b-4289-8512-0c5f8523eab3" uname="sso2.example" type="normal"/>
      <node id="81a9a736-93b6-44d8-9e29-7c0e0237c541" uname="sso1.example" type="normal"/>
    </nodes>
    <resources>
      <group id="group_1">
        <primitive class="ocf" id="IPaddr_192_100_2_7" provider="heartbeat" type="IPaddr">
          <operations>
            <op id="IPaddr_192_100_2_7_mon" interval="5s" name="monitor" timeout="5s"/>
          </operations>
          <instance_attributes id="IPaddr_192_100_2_7_inst_attr">
            <attributes>
              <nvpair id="IPaddr_192_100_2_7_attr_0" name="ip" value="192.100.2.7"/>
            </attributes>
          </instance_attributes>
        </primitive>
        <primitive class="lsb" id="httpd_2" provider="heartbeat" type="httpd">
          <operations>
            <op id="httpd_2_mon" interval="10s" name="monitor" timeout="10s"/>
          </operations>
        </primitive>
      </group>
    </resources>
    <constraints>
      <rsc_location id="rsc_location_group_1" rsc="group_1">
        <rule id="prefered_location_group_1" score="100">
          <expression attribute="#uname" id="prefered_location_group_1_expr" operation="eq" value="sso1.example"/>
        </rule>
      </rsc_location>
    </constraints>
  </configuration>
</cib>
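To see at a glance which resource agents this CIB refers to, a short script can list each `primitive` with its class, type, and configured operations. This is only a sketch using Python's standard `xml.etree.ElementTree`; the helper name `list_primitives` is hypothetical, and it assumes the XML has been saved without the terminal line wrapping that `more` introduces.

```python
import xml.etree.ElementTree as ET


def list_primitives(cib_xml):
    """Return a summary dict for every <primitive> element in a CIB document."""
    root = ET.fromstring(cib_xml)
    rows = []
    for prim in root.iter("primitive"):
        # Collect (name, interval, timeout) for each operation on this resource.
        ops = [
            (op.get("name"), op.get("interval"), op.get("timeout"))
            for op in prim.iter("op")
        ]
        rows.append(
            {
                "id": prim.get("id"),
                "class": prim.get("class"),
                "provider": prim.get("provider"),
                "type": prim.get("type"),
                "ops": ops,
            }
        )
    return rows


# Example use (the path is the one shown in the post):
#   with open("/var/lib/heartbeat/crm/cib.xml") as fh:
#       for row in list_primitives(fh.read()):
#           print(row["id"], row["class"], row["type"], row["ops"])
```

For the configuration above this would report `httpd_2` as class `lsb`, type `httpd`, with a single `monitor` operation. An `lsb` resource is normally driven through the init script of the same name, so running that script by hand on each node is one way to check whether the start failure can be reproduced outside the cluster.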
Below is an excerpt from ha-log:
heartbeat[17995]: 2010/08/30_11:13:48 info: Current arena value: 0
heartbeat[17995]: 2010/08/30_11:13:48 info: MSG stats: 0/0 ms age 80180874 [pid18001/HBWRITE]
heartbeat[17995]: 2010/08/30_11:13:48 info: cl_malloc stats: 415/2829873 40524/19405 [pid18001/HBWRITE]
heartbeat[17995]: 2010/08/30_11:13:48 info: RealMalloc stats: 49408 total malloc bytes. pid [18001/HBWRITE]
heartbeat[17995]: 2010/08/30_11:13:48 info: Current arena value: 0
heartbeat[17995]: 2010/08/30_11:13:48 info: MSG stats: 0/0 ms age 80180874 [pid18002/HBREAD]
heartbeat[17995]: 2010/08/30_11:13:48 info: cl_malloc stats: 416/16173445 40608/19449 [pid18002/HBREAD]
heartbeat[17995]: 2010/08/30_11:13:48 info: RealMalloc stats: 49076 total malloc bytes. pid [18002/HBREAD]
heartbeat[17995]: 2010/08/30_11:13:48 info: Current arena value: 0
heartbeat[17995]: 2010/08/30_11:13:48 info: MSG stats: 0/4942024 ms age 10 [pid18003/HBWRITE]
heartbeat[17995]: 2010/08/30_11:13:48 info: cl_malloc stats: 417/126832615 40692/19493 [pid18003/HBWRITE]
heartbeat[17995]: 2010/08/30_11:13:48 info: RealMalloc stats: 62052 total malloc bytes. pid [18003/HBWRITE]
heartbeat[17995]: 2010/08/30_11:13:48 info: Current arena value: 0
heartbeat[17995]: 2010/08/30_11:13:48 info: MSG stats: 0/2246339 ms age 10 [pid18004/HBREAD]
heartbeat[17995]: 2010/08/30_11:13:48 info: cl_malloc stats: 418/44927269 40776/19537 [pid18004/HBREAD]
heartbeat[17995]: 2010/08/30_11:13:48 info: RealMalloc stats: 42192 total malloc bytes. pid [18004/HBREAD]
heartbeat[17995]: 2010/08/30_11:13:48 info: Current arena value: 0
heartbeat[17995]: 2010/08/30_11:13:48 info: These are nothing to worry about.
cib[18191]: 2010/08/30_16:15:57 info: cib_stats: Processed 2 operations (0.00us average, 0% utilization) in the last 10min
cib[18191]: 2010/08/31_08:35:54 info: cib_stats: Processed 1 operations (0.00us average, 0% utilization) in the last 10min
cib[18191]: 2010/08/31_08:45:54 info: cib_stats: Processed 15 operations (666.00us average, 0% utilization) in the last 10min
cib[18191]: 2010/08/31_08:55:54 info: cib_stats: Processed 40 operations (0.00us average, 0% utilization) in the last 10min
cib[18191]: 2010/08/31_09:05:54 info: cib_stats: Processed 35 operations (0.00us average, 0% utilization) in the last 10min
Any pointers from the experts here would be much appreciated.