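The following is aimed at Heartbeat beginners; many of the commands are copied directly from a PuTTY session.
Heartbeat 2.x features
The main differences between 2.x and 1.x are:
1) 2.x supports CRM management; the resource file changes from haresources to cib.xml,
2) it supports OCF-format resource agents,
3) it can monitor multiple resource groups independently,
4) it supports more than two nodes.
I. Download the packages
Download libnet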
http://www.packetfactory.net/libnet/
Download heartbeat
http://www.linux-ha.org/DownloadSoftware
http://hg.linux-ha.org/lha-2.1/archive/STABLE-2.1.4.tar.bz2
II. Compile and install the packages
tar -xvf libnet-1.1.3-RC-01.tar.gz (this version has problems; use version 1.2 instead)
cd libnet
./configure ; make ; make install
cd ..
tar -xvf Heartbeat-STABLE-2-1-STABLE-2.1.4.tar.bz2
cd Heartbeat-STABLE-2-1-STABLE-2.1.4
./ConfigureMe configure ; make ; make install
With libnet 1.1.3-RC-01 the build fails as follows:
In file included from /usr/include/libnet.h:120,
from send_arp.c:33:
/usr/include/./libnet/libnet-types.h:36:23: error: ../config.h: No such file or directory
gmake[2]: *** [send_arp.o] Error 1
gmake[2]: Leaving directory `/w/Heartbeat-STABLE-2-1-STABLE-2.1.4/heartbeat/libnet_util'
gmake[1]: *** [install-recursive] Error 1
gmake[1]: Leaving directory `/w/Heartbeat-STABLE-2-1-STABLE-2.1.4/heartbeat'
make: *** [install-recursive] Error 1
Switch to the stable libnet 1.2 release:
tar -xvf libnet.tar.gz
Running the same build steps then completes without errors.
III. Edit the configuration files
groupadd -g 694 haclient
useradd -u 694 -g haclient hacluster
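This step is required when compiling and installing from source.
The directory /usr/share/doc/heartbeat-2.1.4 contains sample configuration files.
- httpd is an unknown Resource Agent. Please refer to http://www.linux-ha.org/ResourceAgent
- This message means the httpd service is not installed; installing it resolves the problem.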
- rpm -ivh httpd-2.2.3-11.el5_1.3.x86_64.rpm
- rs2 heartbeat: [9834]: EMERG: Rebooting system. Reason: /usr/lib64/heartbeat/cib
If the accounts above do not exist, the error above occurs.
- rm -rf /var/run/heartbeat
- Change the ownership of the heartbeat directories with the following commands:
- find / -type d -name "heartbeat" -exec chown -R hacluster {} \;
- find / -type d -name "heartbeat" -exec chgrp -R haclient {} \;
- chown -R hacluster /usr/lib/heartbeat
- chown -R hacluster /usr/lib/ocf/resource.d/heartbeat
- chown -R hacluster /usr/include/heartbeat
- chown -R hacluster /usr/share/heartbeat
- chown -R hacluster /var/lib/heartbeat
- chown -R hacluster /var/run/heartbeat
- chgrp -R haclient /usr/lib/heartbeat
- chgrp -R haclient /usr/lib/ocf/resource.d/heartbeat
- chgrp -R haclient /usr/include/heartbeat
- chgrp -R haclient /usr/share/heartbeat
- chgrp -R haclient /var/lib/heartbeat
- chgrp -R haclient /var/run/heartbeat
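The per-directory chown and chgrp commands above can be collapsed into one loop. The following is only a sketch: `fix_owner` and the `DRY_RUN` switch are hypothetical names introduced here, and by default the script only prints what it would run, so it is safe to try on a machine without Heartbeat installed.

```shell
#!/bin/sh
# Sketch: set owner and group on the heartbeat directories in one pass.
# HB_DIRS mirrors the directories listed in the post. DRY_RUN and
# fix_owner are hypothetical names for this example.
HB_DIRS="/usr/lib/heartbeat /usr/lib/ocf/resource.d/heartbeat /usr/include/heartbeat /usr/share/heartbeat /var/lib/heartbeat /var/run/heartbeat"
DRY_RUN=${DRY_RUN:-1}

fix_owner() {
    for dir in $HB_DIRS; do
        if [ "$DRY_RUN" = "1" ]; then
            # Only show the command that would be executed.
            echo "chown -R hacluster:haclient $dir"
        elif [ -d "$dir" ]; then
            # user:group sets owner and group in a single call.
            chown -R hacluster:haclient "$dir"
        fi
    done
}

fix_owner
```

Running it with DRY_RUN=0 as root would apply the changes; directories that do not exist are skipped.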
Creating the accounts is what resolves this problem.
- # cat authkeys
- auth 1
- 1 crc
- #chmod 600 authkeys
This step (chmod 600) must not be skipped.
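The crc method shown above only detects corrupted packets and provides no authentication. As a rough sketch of an alternative, the script below writes an authkeys file that uses the sha1 method with a random key and sets its mode to 600. HA_DIR is a scratch directory chosen for this example; on a real node the file would presumably live in /etc/ha.d, as the other paths in this post suggest, and the same file must be copied to every node.

```shell
#!/bin/sh
# Sketch: write an authkeys file with a random sha1 key and mode 600.
# HA_DIR is a scratch directory for this example (assumption: the real
# location would be /etc/ha.d).
HA_DIR=${HA_DIR:-$(mktemp -d)}

# 16 random bytes rendered as 32 hex characters, used as the shared secret.
KEY=$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')

# Restrictive umask so the file is never world-readable, even briefly.
umask 077
printf 'auth 1\n1 sha1 %s\n' "$KEY" > "$HA_DIR/authkeys"
chmod 600 "$HA_DIR/authkeys"

ls -l "$HA_DIR/authkeys"
```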
- cat ha.cf
- debugfile /var/log/ha-debug
- logfile /var/log/ha-log
- logfacility local0
- keepalive 2
- deadtime 10
- warntime 5
- initdead 60
- udpport 694
- #baud 19200
- #serial /dev/ttyS0 # Linux
- #bcast eth1
- mcast eth0 225.0.0.1 694 1 0
- #ucast eth1 172.16.1.134
- auto_failback on
- watchdog /dev/watchdog
- crm yes
- node rs11
- node rs12
- ping 192.168.1.141
- ping_group group1 10.10.10.254 10.10.10.253
- respawn hacluster /usr/lib/heartbeat/ipfail
- #respawn hacluster /usr/lib/heartbeat/pingd
- #apiauth ping gid=haclient uid=hacluster
- apiauth ipfail gid=haclient uid=hacluster
- respawn hacluster /usr/lib/heartbeat/cibmon -d
- apiauth cibmon uid=hacluster
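The timing directives in ha.cf depend on one another: keepalive should be shorter than warntime, warntime shorter than deadtime, and initdead is usually recommended to be at least twice deadtime. A minimal sketch of a check for this follows. `check_timings` is a hypothetical helper, not part of Heartbeat, and it assumes the values are plain seconds without an ms suffix. The sample file reuses the values from the ha.cf above, without the listing's leading dashes.

```shell
#!/bin/sh
# Sketch: sanity-check the timing directives of an ha.cf file.
# check_timings is a hypothetical helper, not a Heartbeat tool.
check_timings() {
    awk '
        $1 == "keepalive" { keepalive = $2 + 0 }
        $1 == "warntime"  { warntime  = $2 + 0 }
        $1 == "deadtime"  { deadtime  = $2 + 0 }
        $1 == "initdead"  { initdead  = $2 + 0 }
        END {
            bad = 0
            if (!(keepalive < warntime))     { print "warntime should exceed keepalive"; bad = 1 }
            if (!(warntime < deadtime))      { print "deadtime should exceed warntime"; bad = 1 }
            if (!(initdead >= 2 * deadtime)) { print "initdead should be at least 2x deadtime"; bad = 1 }
            if (!bad) print "timings look consistent"
            exit bad
        }
    ' "$1"
}

# Sample file with the timing values used in this post.
SAMPLE=$(mktemp)
cat > "$SAMPLE" <<'EOF'
keepalive 2
deadtime 10
warntime 5
initdead 60
EOF

check_timings "$SAMPLE"
```

With the values from this post the check prints "timings look consistent"; commented-out lines such as #baud are ignored because their first field does not match any directive name.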
- #cat bak.haresources
- rs11 192.168.1.145 httpd
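- This file was originally named haresources and is used by 1.x; it is renamed here to keep the two apart.
- If this format is not used, the resources fail to start; a similar form is rs11 192.168.1.145/24/eth0 httpd
Below is how the cib.xml configuration file is generated; paths under /lib64 indicate an x86_64 system.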
- rm -rf /var/lib/heartbeat/crm/cib.xml*
- /usr/lib64/heartbeat/haresources2cib.py haresources
- rm -rf /var/lib/heartbeat/crm/cib.xml*
- /usr/lib64/heartbeat/haresources2cib.py -stout -c /etc/ha.d/ha.cf /etc/ha.d/haresources
- rm -rf /var/lib/heartbeat/crm/cib.xml*
- /usr/lib64/heartbeat/haresources2cib.py -stout -c /etc/ha.d/ha.cf /etc/ha.d/bak.haresources
- rm -rf /var/lib/heartbeat/crm/cib.xml*
- /usr/lib/heartbeat/haresources2cib.py -stout -c /etc/ha.d/ha.cf /etc/ha.d/bak.haresources
- scp /var/lib/heartbeat/crm/cib.xml 192.168.1.147:/var/lib/heartbeat/crm/cib.xml
- scp ha.cf root@192.168.1.144:/etc/ha.d/ha.cf
- scp /var/lib/heartbeat/crm/cib.xml root@192.168.1.144:/var/lib/heartbeat/crm/cib.xml
- cat /var/lib/heartbeat/crm/cib.xml
- <resources>
- <group id="group_1">
- <primitive class="ocf" id="IPaddr_192_168_1_145" provider="heartbeat" type="IPaddr">
- <operations>
- <op id="IPaddr_192_168_1_145_mon" interval="5s" name="monitor" timeout="5s"/>
- </operations>
- <instance_attributes id="IPaddr_192_168_1_145_inst_attr">
- <attributes>
- <nvpair id="IPaddr_192_168_1_145_attr_0" name="ip" value="192.168.1.145"/>
- </attributes>
- </instance_attributes>
- </primitive>
- <primitive class="lsb" id="httpd_2" provider="heartbeat" type="httpd">
- <operations>
- <op id="httpd_2_mon" interval="120s" name="monitor" timeout="60s"/>
- </operations>
- </primitive>
- </group>
- </resources>
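- Change the httpd_2 section above as follows:
- interval="20s"
- timeout="10s"
- This means the resource is checked every 20 seconds. If it is found not to be running, Heartbeat tries to start it; if it has still not started after 10s, the resource fails over to the other node. These values can be reduced further. Otherwise the default of 2 minutes gives the impression that a service that went down is neither restarted nor failed over.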
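The interval and timeout change described above can be sketched with sed. This example works on a scratch copy of the two op lines (CIB_COPY is a name made up for this example) rather than on the live /var/lib/heartbeat/crm/cib.xml; on a running cluster, changes are normally applied through the cluster tools rather than by editing the file in place.

```shell
#!/bin/sh
# Sketch: tighten the httpd_2 monitor op in a copy of the cib.xml fragment.
# CIB_COPY is a scratch file for this example, not the live CIB.
CIB_COPY=$(mktemp)
cat > "$CIB_COPY" <<'EOF'
<op id="IPaddr_192_168_1_145_mon" interval="5s" name="monitor" timeout="5s"/>
<op id="httpd_2_mon" interval="120s" name="monitor" timeout="60s"/>
EOF

# Only the line carrying id="httpd_2_mon" is rewritten; the IPaddr op
# keeps its original values.
sed -e '/id="httpd_2_mon"/ s/interval="[^"]*"/interval="20s"/' \
    -e '/id="httpd_2_mon"/ s/timeout="[^"]*"/timeout="10s"/' \
    "$CIB_COPY" > "$CIB_COPY.new"

cat "$CIB_COPY.new"
```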
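Note: IPaddr uses an OCF-format control script, located at /usr/lib/ocf/resource.d/heartbeat/IPaddr
Scripts under /etc/init.d/ are all LSB scripts; crm_mon shows this, and crm_resource -L displays the same information.
The differences between the OCF and LSB formats:
An LSB script must support status, that is, it must accept the three arguments start, stop, and status. An OCF script must support the three arguments start, stop, and monitor. The status and monitor arguments are used to monitor the resource, so they are very important.
For example, when an LSB-style script is run as ./Mysql status:
output containing OK or running means the resource is healthy;
output containing stopped or No means the resource is not healthy.
When an OCF-style script is run as ./Mysql monitor:
a return code of 0 means the resource is healthy;
a return code of 7 means the resource has a problem (it is not running).
OCF-format scripts live in /usr/lib/ocf/resource.d/heartbeat
LSB scripts are usually found under /etc/init.d/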
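The OCF monitor contract described above (0 when running, 7 when not running) can be illustrated with a small sketch. This is not a complete resource agent: `svc_monitor`, DEMO_DIR, and PIDFILE are hypothetical names, and the pidfile test stands in for whatever health check a real agent would perform.

```shell
#!/bin/sh
# Sketch of the monitor return codes: 0 = running, 7 = not running.
# svc_monitor, DEMO_DIR and PIDFILE are hypothetical names for this example.
OCF_SUCCESS=0
OCF_NOT_RUNNING=7
DEMO_DIR=$(mktemp -d)
PIDFILE="$DEMO_DIR/demo.pid"

svc_monitor() {
    # Running means: the pidfile exists and the process it names is alive.
    if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}

# Case 1: the pidfile names a live process (this shell itself).
echo $$ > "$PIDFILE"
rc=0; svc_monitor || rc=$?
echo "with live pid: rc=$rc"

# Case 2: no pidfile, so the resource counts as not running.
rm -f "$PIDFILE"
rc=0; svc_monitor || rc=$?
echo "without pidfile: rc=$rc"
```

The first call reports rc=0 and the second rc=7, which is the distinction the CRM relies on when deciding whether to restart or move a resource.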
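Once everything has been started,
crm_mon shows the resource status; the refresh interval can be set with a parameter, for example crm_mon -i1 refreshes every second: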
- #crm_mon -i1
- Refresh in 1s...
- ============
- Last updated: Wed Dec 10 00:57:44 2008
- Current DC: rs12 (041ec13a-a975-4813-ae93-56ca0dfc89b0)
- 2 Nodes configured.
- 1 Resources configured.
- ============
- Node: rs12 (041ec13a-a975-4813-ae93-56ca0dfc89b0): online
- Node: rs11 (6bccf2ee-a1b7-40d2-a875-729149e372ea): online
- Resource Group: group_1
- IPaddr_192_168_1_145 (ocf::heartbeat:IPaddr): Started rs11
- httpd_2 (lsb:httpd): Started rs11
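When watching a failover it can help to reduce this output to just the resource-to-node mapping. A rough sketch with awk follows; `where_started` is a hypothetical helper, and the sample input is taken from the output above without the listing's leading dashes. On a live node the input could come from a one-shot run such as crm_mon -1 instead of the here-document.

```shell
#!/bin/sh
# Sketch: report where each resource is started, from crm_mon-style text.
# where_started is a hypothetical helper, not a Heartbeat tool.
where_started() {
    # A matching line ends in "Started <node>"; NF >= 2 guards empty lines.
    awk 'NF >= 2 && $(NF-1) == "Started" { print $1 " -> " $NF }'
}

where_started <<'EOF'
Resource Group: group_1
IPaddr_192_168_1_145 (ocf::heartbeat:IPaddr): Started rs11
httpd_2 (lsb:httpd): Started rs11
EOF
```

Lines for stopped or failed resources do not end in "Started <node>" and are therefore left out of the report.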
List all resources
#crm_resource -L
Resource Group: group_1
IPaddr_192_168_1_145 (ocf::heartbeat:IPaddr)
httpd_2 (lsb:httpd)
Show which node a resource is running on
#crm_resource -W -r httpd_2
resource httpd_2 is running on: rs11
Show a resource's definition in cib.xml
- # crm_resource -x -r httpd_2
- httpd_2 (lsb:httpd): Started rs11
- raw xml:
- <primitive class="lsb" provider="heartbeat" type="httpd" id="httpd_2">
- <operations>
- <op id="httpd_2_mon" interval="120s" name="monitor" timeout="60s"/>
- </operations>
- <instance_attributes id="httpd_2">
- <attributes>
- <nvpair name="target_role" id="httpd_2-target_role" value="started"/>
- </attributes>
- </instance_attributes>
- </primitive>
Start/stop a resource
#crm_resource -r httpd_2 -p target_role -v started
#crm_resource -r httpd_2 -p target_role -v stopped
Move a resource from its current node to another node
#crm_resource -M -r httpd_2
Move a resource to a specified node
#crm_resource -M -r httpd_2 -H rs12
Allow a resource to return to its normal node
#crm_resource -U -r httpd_2
Delete a resource from the CRM
#crm_resource -D -r httpd_2 -t primitive
Delete a resource group from the CRM
#crm_resource -D -r My-DRBD-group -t group
Stop the CRM from managing a resource
#crm_resource -p is_managed -r httpd_2 -t primitive -v off
Put a resource back under CRM management
#crm_resource -p is_managed -r httpd_2 -t primitive -v on
Restart a resource
#crm_resource -C -H rs12 -r httpd_2
Check all nodes for resources not in the CRM
#crm_resource -P
Check a specified node for resources not in the CRM
#crm_resource -P -H rs12
Set an attribute of a resource
#crm_resource -r httpd_2 -p email -v "lvsheat@qq.com"
- Network failover test
- Refresh in 1s...
- ============
- Last updated: Wed Dec 10 01:11:16 2008
- Current DC: rs12 (041ec13a-a975-4813-ae93-56ca0dfc89b0)
- 2 Nodes configured.
- 1 Resources configured.
- ============
- Node: rs12 (041ec13a-a975-4813-ae93-56ca0dfc89b0): online
- Node: rs11 (6bccf2ee-a1b7-40d2-a875-729149e372ea): OFFLINE
- Resource Group: group_1
- IPaddr_192_168_1_145 (ocf::heartbeat:IPaddr): Stopped==> Started rs12
- httpd_2 (lsb:httpd): Stopped==> Started rs12
- Refresh in 1s...
- ============
- Last updated: Wed Dec 10 01:11:42 2008
- Current DC: rs11 (6bccf2ee-a1b7-40d2-a875-729149e372ea)
- 2 Nodes configured.
- 1 Resources configured.
- ============
- Node: rs12 (041ec13a-a975-4813-ae93-56ca0dfc89b0): OFFLINE
- Node: rs11 (6bccf2ee-a1b7-40d2-a875-729149e372ea): online
- Resource Group: group_1
- IPaddr_192_168_1_145 (ocf::heartbeat:IPaddr): Started rs11
- httpd_2 (lsb:httpd): Started rs11
Service failover; this depends on the values set in cib.xml:
- Refresh in 1s...
- ============
- Last updated: Wed Dec 10 01:12:25 2008
- Current DC: rs12 (041ec13a-a975-4813-ae93-56ca0dfc89b0)
- 2 Nodes configured.
- 1 Resources configured.
- ============
- Node: rs12 (041ec13a-a975-4813-ae93-56ca0dfc89b0): online
- Node: rs11 (6bccf2ee-a1b7-40d2-a875-729149e372ea): online
- Resource Group: group_1
- IPaddr_192_168_1_145 (ocf::heartbeat:IPaddr): Started rs11
- httpd_2 (lsb:httpd): Started rs11 FAILED
- Failed actions:
- httpd_2_monitor_5000 (node=rs11, call=13, rc=7): complete
- Refresh in 1s...
- ============
- Last updated: Wed Dec 10 01:12:35 2008
- Current DC: rs12 (041ec13a-a975-4813-ae93-56ca0dfc89b0)
- 2 Nodes configured.
- 1 Resources configured.
- ============
- Node: rs12 (041ec13a-a975-4813-ae93-56ca0dfc89b0): online
- Node: rs11 (6bccf2ee-a1b7-40d2-a875-729149e372ea): online
- Resource Group: group_1
- IPaddr_192_168_1_145 (ocf::heartbeat:IPaddr): Started rs11
- httpd_2 (lsb:httpd): Stopped
- Failed actions:
- httpd_2_monitor_5000 (node=rs11, call=16, rc=7): complete
- Refresh in 1s...
- ============
- Last updated: Wed Dec 10 01:12:38 2008
- Current DC: rs12 (041ec13a-a975-4813-ae93-56ca0dfc89b0)
- 2 Nodes configured.
- 1 Resources configured.
- ============
- Node: rs12 (041ec13a-a975-4813-ae93-56ca0dfc89b0): online
- Node: rs11 (6bccf2ee-a1b7-40d2-a875-729149e372ea): online
- Resource Group: group_1
- IPaddr_192_168_1_145 (ocf::heartbeat:IPaddr): Started rs11
- httpd_2 (lsb:httpd): Started rs11
- Node crash failover
- Refresh in 1s...
- ============
- Last updated: Wed Dec 10 01:13:23 2008
- Current DC: rs12 (041ec13a-a975-4813-ae93-56ca0dfc89b0)
- 2 Nodes configured.
- 1 Resources configured.
- ============
- Node: rs12 (041ec13a-a975-4813-ae93-56ca0dfc89b0): online
- Node: rs11 (6bccf2ee-a1b7-40d2-a875-729149e372ea): online
- Resource Group: group_1
- IPaddr_192_168_1_145 (ocf::heartbeat:IPaddr): Started rs11
- httpd_2 (lsb:httpd): Stopped
- Failed actions:
- httpd_2_monitor_5000 (node=rs11, call=19, rc=7): complete
- Refresh in 1s...
- ============
- Last updated: Wed Dec 10 01:13:41 2008
- Current DC: rs12 (041ec13a-a975-4813-ae93-56ca0dfc89b0)
- 2 Nodes configured.
- 1 Resources configured.
- ============
- Node: rs12 (041ec13a-a975-4813-ae93-56ca0dfc89b0): online
- Node: rs11 (6bccf2ee-a1b7-40d2-a875-729149e372ea): OFFLINE
- Resource Group: group_1
- IPaddr_192_168_1_145 (ocf::heartbeat:IPaddr): Started rs12
- httpd_2 (lsb:httpd): Started rs12
mv httpd.conf httpd.conf.1
killall httpd
[ Last edited by kns1024wh on 2008-12-30 11:07 ]