Question about heartbeat email notifications

#1 | Posted 2009-03-26 13:01
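After configuring it I do receive the notification emails, but they have no subject and no body.

Also, can heartbeat send notifications about things it detects itself, for example that the other node has died?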


[root@f801 ~]# cat /var/lib/heartbeat/crm/cib.xml
<cib admin_epoch="0" epoch="2" num_updates="1" generated="false" have_quorum="false" ignore_dtd="false" num_peers="0" cib_feature_revision="2.0" cib-last-written="Thu Mar 26 12:12:55 2009">
   <configuration>
     <crm_config>
       <cluster_property_set id="cib-bootstrap-options">
         <attributes>
           <nvpair id="cib-bootstrap-options-symmetric-cluster" name="symmetric-cluster" value="true"/>
           <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="stop"/>
           <nvpair id="cib-bootstrap-options-default-resource-stickiness" name="default-resource-stickiness" value="0"/>
           <nvpair id="cib-bootstrap-options-default-resource-failure-stickiness" name="default-resource-failure-stickiness" value="0"/>
           <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
           <nvpair id="cib-bootstrap-options-stonith-action" name="stonith-action" value="reboot"/>
           <nvpair id="cib-bootstrap-options-startup-fencing" name="startup-fencing" value="true"/>
           <nvpair id="cib-bootstrap-options-stop-orphan-resources" name="stop-orphan-resources" value="true"/>
           <nvpair id="cib-bootstrap-options-stop-orphan-actions" name="stop-orphan-actions" value="true"/>
           <nvpair id="cib-bootstrap-options-remove-after-stop" name="remove-after-stop" value="false"/>
           <nvpair id="cib-bootstrap-options-short-resource-names" name="short-resource-names" value="true"/>
           <nvpair id="cib-bootstrap-options-transition-idle-timeout" name="transition-idle-timeout" value="5min"/>
           <nvpair id="cib-bootstrap-options-default-action-timeout" name="default-action-timeout" value="20s"/>
           <nvpair id="cib-bootstrap-options-is-managed-default" name="is-managed-default" value="true"/>
           <nvpair id="cib-bootstrap-options-cluster-delay" name="cluster-delay" value="60s"/>
           <nvpair id="cib-bootstrap-options-pe-error-series-max" name="pe-error-series-max" value="-1"/>
           <nvpair id="cib-bootstrap-options-pe-warn-series-max" name="pe-warn-series-max" value="-1"/>
           <nvpair id="cib-bootstrap-options-pe-input-series-max" name="pe-input-series-max" value="-1"/>
           <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="2.1.3-node: 552305612591183b1628baa5bc6e903e0f1e26a3"/>
         </attributes>
       </cluster_property_set>
     </crm_config>
     <nodes>
       <node id="6e49df15-d383-4926-8aa4-030556557a53" uname="f801.haijiye" type="normal"/>
       <node id="fac48e02-49f9-45af-b8fe-81a797f01586" uname="f802.haijiye" type="normal"/>
     </nodes>
     <resources>
       <group id="group_1">
         <primitive class="ocf" id="IPaddr_192_168_1_118" provider="heartbeat" type="IPaddr">
           <operations>
             <op id="IPaddr_192_168_1_118_mon" interval="5s" name="monitor" timeout="5s"/>
           </operations>
           <instance_attributes id="IPaddr_192_168_1_118_inst_attr">
             <attributes>
               <nvpair id="IPaddr_192_168_1_118_attr_0" name="ip" value="192.168.1.118"/>
             </attributes>
           </instance_attributes>
         </primitive>
         <primitive class="heartbeat" id="httpd_2" provider="heartbeat" type="httpd">
           <operations>
             <op id="httpd_2_mon" interval="120s" name="monitor" timeout="60s"/>
           </operations>
         </primitive>
         <primitive class="ocf" id="MailTo_3" provider="heartbeat" type="MailTo">
           <operations>
             <op id="MailTo_3_mon" interval="120s" name="monitor" timeout="60s"/>
           </operations>
           <instance_attributes id="MailTo_3_inst_attr">
             <attributes>
               <nvpair id="MailTo_3_attr_0" name="email" value="cocobear@yeah.net"/>
               <nvpair id="MailTo_3_attr_1" name="subject" value="Status-of-httpd-changed"/>
             </attributes>
           </instance_attributes>
         </primitive>
       </group>
     </resources>
     <constraints>
       <rsc_location id="rsc_location_group_1" rsc="group_1">
         <rule id="prefered_location_group_1" score="100">
           <expression attribute="#uname" id="prefered_location_group_1_expr" operation="eq" value="f801.haijiye"/>
         </rule>
       </rsc_location>
     </constraints>
   </configuration>
</cib>

#2 | Posted 2009-03-26 13:19

Reply to post #1 by 可可熊

Querying a parameter of a resource. Say the resource is the following:
<primitive id="example_mail" class="ocf" type="MailTo" provider="heartbeat">
   <instance_attributes id="example_mail_inst">
      <attributes>
         <nvpair id="example_mail_inst_attr0" name="email" value="root"/>
         <nvpair id="example_mail_inst_attr1" name="subject" value="Example Failover"/>
      </attributes>
   </instance_attributes>
</primitive>

You could query the email address using the following:
   crm_resource -r example_mail -g email
To test, you can set it with:
crm_resource -r example_mail -p email -v "abc@abc.com"

[Last edited by kns1024wh on 2009-3-26 15:08]

#3 | Posted 2009-03-26 13:58
I am using heartbeat v2 with the CRM module. I want to receive an email notification when one of the nodes has a problem. For my test I stopped heartbeat manually:

service heartbeat stop

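I do receive an email, but it is empty, and I don't know why.

I would also like the primary node to send an email notification when the standby node has a problem, but that does not seem to be possible: I tried rebooting the standby and received no mail at all.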
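
One way to narrow down the empty mails is to take heartbeat out of the picture and check whether the local mail command on each node delivers a subject and a body at all. The sketch below assumes a `mail`-style command that accepts `-s subject recipient` and reads the body from standard input; `send_test_mail` and the `MAILCMD` override are hypothetical names made up for this example.

```shell
#!/bin/bash
# Send a test message with a known subject and body through the local mail command.
# MAILCMD defaults to "mail"; overriding it is an assumption for nodes that use
# a different program with the same "-s subject recipient" calling convention.
send_test_mail() {
    local rcpt="$1"
    local mailcmd="${MAILCMD:-mail}"
    local subject
    subject="mail test from $(uname -n)"
    # The body goes in on standard input, the subject and recipient as arguments.
    printf 'Test body sent at %s\n' "$(date)" | "$mailcmd" -s "$subject" "$rcpt"
}

# Example (the address is a placeholder):
#   send_test_mail admin@example.com
```

Run it on both nodes with the same recipient. If one node's message arrives blank here too, the problem is probably in that node's mail setup rather than in the MailTo resource.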

#4 | Posted 2009-03-26 14:00
Originally posted by kns1024wh on 2009-3-26 13:19

[Last edited by kns1024wh on 2009-3-26 16:04]

#5 | Posted 2009-03-26 14:07
For example, after the standby node is shut down, heartbeat on the primary node writes this log entry:

crmd[1681]: 2009/03/26_14:05:25 notice: crmd_ha_status_callback: Status update: Node f802.haijiye now has status [dead]

Is it possible to send a mail notification in this situation?
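
A minimal sketch of one way to do this outside of heartbeat itself: follow the log and send a mail whenever a line like the one above appears. The log path `/var/log/ha-log`, the recipient address, and the function names `dead_node_from_line` and `watch_for_dead_nodes` are assumptions for illustration, not something heartbeat provides. The pattern is taken from the log line quoted above, and the mail command is assumed to accept `-s subject recipient`.

```shell
#!/bin/bash
# Print the node name if the log line reports a node going dead;
# print nothing for any other line.
dead_node_from_line() {
    printf '%s\n' "$1" |
        sed -n 's/.*Status update: Node \([^ ]*\) now has status \[dead].*/\1/p'
}

# Follow a log file and mail a notice for every "status [dead]" line.
# Both arguments are assumptions: the log path depends on your logging setup.
watch_for_dead_nodes() {
    local log="$1" rcpt="$2" line node
    tail -F "$log" | while IFS= read -r line; do
        node=$(dead_node_from_line "$line")
        if [ -n "$node" ]; then
            printf 'Log line:\n%s\n' "$line" |
                mail -s "heartbeat: node $node is dead" "$rcpt"
        fi
    done
}

# Example (placeholder path and address):
#   watch_for_dead_nodes /var/log/ha-log admin@example.com
```

Since the watcher runs on the surviving node, it covers the case where the standby dies and no resource moves, which is where the MailTo resource alone sends nothing.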

#6 | Posted 2009-03-26 15:50
<primitive id="resource_" class="ocf" type="MailTo" provider="heartbeat">
  <meta_attributes id="resource__meta_attrs">
    <attributes>
      <nvpair id="resource__metaattr_target_role" name="target_role" value="started"/>
    </attributes>
  </meta_attributes>
  <instance_attributes id="resource__instance_attrs">
    <attributes>
      <nvpair id="0af557aa-018e-4fa5-9b48-953f2d33750a" name="email" value="lvsheat@qq.com"/>
      <nvpair id="f68a1346-2bd3-42ae-8b5a-3c96bbef8910" name="subject" value="test001"/>
    </attributes>
  </instance_attributes>
</primitive>

test001 Takeover in progress at Thu Mar 26 15:44:49 CST 2009 on xxxxx
test001 Migrating resource away at Thu Mar 26 15:52:44 CST 2009 from xxxxx

[Last edited by kns1024wh on 2009-3-26 16:05]

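#7 | Posted 2009-03-26 16:13

Reply to post #6 by kns1024wh

Right now one machine sends mails with content, while the other sends only blank ones.

f801 is the server that sends the blank mails. I rebooted f802, and the mail f801 sent when it took over was blank.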

cib[1562]: 2009/03/26_16:02:44 info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm
cib[1562]: 2009/03/26_16:02:44 info: mem_handle_event: no mbr_track info
cib[1562]: 2009/03/26_16:02:44 info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm
cib[1562]: 2009/03/26_16:02:44 info: mem_handle_event: instance=9, nodes=1, new=0, lost=1, n_idx=0, new_idx=1, old_idx=3
cib[1562]: 2009/03/26_16:02:44 info: cib_ccm_msg_callback: LOST: f802.haijiye
cib[1562]: 2009/03/26_16:02:44 info: cib_ccm_msg_callback: PEER: f801.haijiye
cib[1562]: 2009/03/26_16:02:44 info: cib_process_readwrite: We are now in R/W mode
pengine[1976]: 2009/03/26_16:02:44 info: G_main_add_SignalHandler: Added signal handler for signal 15
pengine[1976]: 2009/03/26_16:02:44 info: pe_init: Starting pengine
crmd[1566]: 2009/03/26_16:02:44 info: join_make_offer: Making join offers based on membership 9
crmd[1566]: 2009/03/26_16:02:44 info: do_dc_join_offer_all: join-1: Waiting on 1 outstanding join acks
tengine[1975]: 2009/03/26_16:02:44 info: G_main_add_SignalHandler: Added signal handler for signal 15
tengine[1975]: 2009/03/26_16:02:44 info: G_main_add_TriggerHandler: Added signal manual handler
tengine[1975]: 2009/03/26_16:02:44 info: G_main_add_TriggerHandler: Added signal manual handler
tengine[1975]: 2009/03/26_16:02:44 info: te_init: Registering TE UUID: 112c0c00-03fb-4f6a-be94-4cf8d83f5cc9
tengine[1975]: 2009/03/26_16:02:44 info: set_graph_functions: Setting custom graph functions
tengine[1975]: 2009/03/26_16:02:44 info: unpack_graph: Unpacked transition -1: 0 actions in 0 synapses
tengine[1975]: 2009/03/26_16:02:44 info: te_init: Starting tengine
tengine[1975]: 2009/03/26_16:02:44 info: te_connect_stonith: Attempting connection to fencing daemon...
cib[1562]: 2009/03/26_16:02:44 info: cib_null_callback: Setting cib_diff_notify callbacks for tengine: on
crmd[1566]: 2009/03/26_16:02:44 info: update_dc: Set DC to f801.haijiye (2.0)
crmd[1566]: 2009/03/26_16:02:45 info: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ]
crmd[1566]: 2009/03/26_16:02:45 info: do_state_transition: All 1 cluster nodes responded to the join offer.
cib[1562]: 2009/03/26_16:02:45 info: sync_our_cib: Syncing CIB to all peers
crmd[1566]: 2009/03/26_16:02:45 info: update_attrd: Connecting to attrd...
attrd[1565]: 2009/03/26_16:02:45 info: attrd_local_callback: Sending full refresh
tengine[1975]: 2009/03/26_16:02:45 info: te_connect_stonith: Connected
crmd[1566]: 2009/03/26_16:02:45 info: update_dc: Set DC to f801.haijiye (2.0)
heartbeat[1542]: 2009/03/26_16:02:45 WARN: glib: TTY write timeout on [/dev/ttyS0] (no connection or bad cable? [see documentation])
heartbeat[1542]: 2009/03/26_16:02:45 info: glib: See http://linux-ha.org/FAQ#TTYtimeout for details
crmd[1566]: 2009/03/26_16:02:45 info: do_dc_join_ack: join-1: Updating node state to member for f801.haijiye
tengine[1975]: 2009/03/26_16:02:45 info: process_graph_event: Action IPaddr_192_168_1_118_monitor_0 initiated by a different transitioner
tengine[1975]: 2009/03/26_16:02:45 info: update_abort_priority: Abort priority upgraded to 1000000
tengine[1975]: 2009/03/26_16:02:45 info: update_abort_priority: 'DC Takeover' abort superceeded
tengine[1975]: 2009/03/26_16:02:45 info: process_graph_event: Action httpd_2_monitor_0 initiated by a different transitioner
tengine[1975]: 2009/03/26_16:02:45 info: process_graph_event: Action MailTo_3_monitor_0 initiated by a different transitioner
crmd[1566]: 2009/03/26_16:02:45 info: do_state_transition: State transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED cause=C_FSA_INTERNAL origin=check_join_state ]
crmd[1566]: 2009/03/26_16:02:45 info: do_state_transition: All 1 cluster nodes are eligible to run resources.
pengine[1976]: 2009/03/26_16:02:46 info: determine_online_status: Node f801.haijiye is online
pengine[1976]: 2009/03/26_16:02:46 notice: group_print: Resource Group: group_1
pengine[1976]: 2009/03/26_16:02:46 notice: native_print:     IPaddr_192_168_1_118        (heartbeat::ocf:IPaddr):        Started f801.haijiye
pengine[1976]: 2009/03/26_16:02:46 notice: native_print:     httpd_2        (heartbeat:httpd):        Stopped
pengine[1976]: 2009/03/26_16:02:46 notice: native_print:     MailTo_3        (heartbeat::ocf:MailTo):        Stopped
pengine[1976]: 2009/03/26_16:02:46 notice: NoRoleChange: Leave resource IPaddr_192_168_1_118        (f801.haijiye)
pengine[1976]: 2009/03/26_16:02:46 notice: RecurringOp: f801.haijiye           IPaddr_192_168_1_118_monitor_5000
pengine[1976]: 2009/03/26_16:02:46 notice: StartRsc:  f801.haijiye        Start httpd_2
pengine[1976]: 2009/03/26_16:02:46 notice: RecurringOp: f801.haijiye           httpd_2_monitor_120000
pengine[1976]: 2009/03/26_16:02:46 notice: StartRsc:  f801.haijiye        Start MailTo_3
pengine[1976]: 2009/03/26_16:02:46 notice: RecurringOp: f801.haijiye           MailTo_3_monitor_120000
crmd[1566]: 2009/03/26_16:02:46 info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=route_message ]
tengine[1975]: 2009/03/26_16:02:46 info: unpack_graph: Unpacked transition 0: 8 actions in 8 synapses
tengine[1975]: 2009/03/26_16:02:46 info: te_pseudo_action: Pseudo action 11 fired and confirmed
tengine[1975]: 2009/03/26_16:02:46 info: send_rsc_command: Initiating action 5: IPaddr_192_168_1_118_start_0 on f801.haijiye
crmd[1566]: 2009/03/26_16:02:46 info: do_lrm_rsc_op: Performing op=IPaddr_192_168_1_118_start_0 key=5:0:112c0c00-03fb-4f6a-be94-4cf8d83f5cc9)
lrmd[1563]: 2009/03/26_16:02:46 info: rsc:IPaddr_192_168_1_118: start
pengine[1976]: 2009/03/26_16:02:46 info: process_pe_message: Transition 0: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-96.bz2
IPaddr[1977]:        2009/03/26_16:02:46 INFO: Using calculated nic for 192.168.1.118: eth1
IPaddr[1977]:        2009/03/26_16:02:46 INFO: Using calculated netmask for 192.168.1.118: 255.255.255.0
lrmd[1563]: 2009/03/26_16:02:47 info: Managed IPaddr_192_168_1_118:start process 1977 exited with return code 0.
crmd[1566]: 2009/03/26_16:02:47 info: process_lrm_event: LRM operation IPaddr_192_168_1_118_start_0 (call=6, rc=0) complete
tengine[1975]: 2009/03/26_16:02:47 info: match_graph_event: Action IPaddr_192_168_1_118_start_0 (5) confirmed on f801.haijiye (rc=0)
tengine[1975]: 2009/03/26_16:02:47 info: send_rsc_command: Initiating action 6: IPaddr_192_168_1_118_monitor_5000 on f801.haijiye
tengine[1975]: 2009/03/26_16:02:47 info: send_rsc_command: Initiating action 7: httpd_2_start_0 on f801.haijiye
crmd[1566]: 2009/03/26_16:02:47 info: do_lrm_rsc_op: Performing op=IPaddr_192_168_1_118_monitor_5000 key=6:0:112c0c00-03fb-4f6a-be94-4cf8d83f5cc9)
crmd[1566]: 2009/03/26_16:02:47 info: do_lrm_rsc_op: Performing op=httpd_2_start_0 key=7:0:112c0c00-03fb-4f6a-be94-4cf8d83f5cc9)
lrmd[1563]: 2009/03/26_16:02:47 info: rsc:httpd_2: start
lrmd[1563]: 2009/03/26_16:02:47 info: RA output: (httpd_2:start:stdout) Starting httpd:
lrmd[1563]: 2009/03/26_16:02:47 info: Managed IPaddr_192_168_1_118:monitor process 2038 exited with return code 0.
crmd[1566]: 2009/03/26_16:02:47 info: process_lrm_event: LRM operation IPaddr_192_168_1_118_monitor_5000 (call=7, rc=0) complete
tengine[1975]: 2009/03/26_16:02:47 info: match_graph_event: Action IPaddr_192_168_1_118_monitor_5000 (6) confirmed on f801.haijiye (rc=0)
lrmd[1563]: 2009/03/26_16:02:50 info: RA output: (httpd_2:start:stderr) httpd: apr_sockaddr_info_get() failed for f801.haijiye

lrmd[1563]: 2009/03/26_16:02:50 info: RA output: (httpd_2:start:stderr) httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1 for ServerName

lrmd[1563]: 2009/03/26_16:02:52 info: RA output: (httpd_2:start:stdout) [
lrmd[1563]: 2009/03/26_16:02:52 info: RA output: (httpd_2:start:stdout)   OK  
lrmd[1563]: 2009/03/26_16:02:52 info: RA output: (httpd_2:start:stdout) ]
lrmd[1563]: 2009/03/26_16:02:52 info: RA output: (httpd_2:start:stdout)
lrmd[1563]: 2009/03/26_16:02:52 info: RA output: (httpd_2:start:stdout)

lrmd[1563]: 2009/03/26_16:02:52 info: Managed httpd_2:start process 2039 exited with return code 0.
crmd[1566]: 2009/03/26_16:02:52 info: process_lrm_event: LRM operation httpd_2_start_0 (call=8, rc=0) complete
tengine[1975]: 2009/03/26_16:02:52 info: match_graph_event: Action httpd_2_start_0 (7) confirmed on f801.haijiye (rc=0)
tengine[1975]: 2009/03/26_16:02:52 info: send_rsc_command: Initiating action 8: httpd_2_monitor_120000 on f801.haijiye
crmd[1566]: 2009/03/26_16:02:52 info: do_lrm_rsc_op: Performing op=httpd_2_monitor_120000 key=8:0:112c0c00-03fb-4f6a-be94-4cf8d83f5cc9)
tengine[1975]: 2009/03/26_16:02:52 info: send_rsc_command: Initiating action 9: MailTo_3_start_0 on f801.haijiye
crmd[1566]: 2009/03/26_16:02:52 info: do_lrm_rsc_op: Performing op=MailTo_3_start_0 key=9:0:112c0c00-03fb-4f6a-be94-4cf8d83f5cc9)
lrmd[1563]: 2009/03/26_16:02:52 info: rsc:MailTo_3: start
lrmd[1563]: 2009/03/26_16:02:53 info: Managed MailTo_3:start process 2060 exited with return code 0.
crmd[1566]: 2009/03/26_16:02:53 info: process_lrm_event: LRM operation MailTo_3_start_0 (call=10, rc=0) complete
lrmd[1563]: 2009/03/26_16:02:53 WARN: G_SIG_dispatch: Dispatch function for SIGCHLD was delayed 230 ms (> 100 ms) before being called (GSource: 0x93fd1f0)
lrmd[1563]: 2009/03/26_16:02:53 info: G_SIG_dispatch: started at 429454334 should have started at 429454311
lrmd[1563]: 2009/03/26_16:02:53 info: Managed httpd_2:monitor process 2059 exited with return code 0.
crmd[1566]: 2009/03/26_16:02:53 info: process_lrm_event: LRM operation httpd_2_monitor_120000 (call=9, rc=0) complete
lrmd[1563]: 2009/03/26_16:02:53 info: Managed IPaddr_192_168_1_118:monitor process 2065 exited with return code 0.
tengine[1975]: 2009/03/26_16:02:53 info: match_graph_event: Action MailTo_3_start_0 (9) confirmed on f801.haijiye (rc=0)
tengine[1975]: 2009/03/26_16:02:53 info: te_pseudo_action: Pseudo action 12 fired and confirmed
tengine[1975]: 2009/03/26_16:02:53 info: send_rsc_command: Initiating action 10: MailTo_3_monitor_120000 on f801.haijiye
crmd[1566]: 2009/03/26_16:02:53 info: do_lrm_rsc_op: Performing op=MailTo_3_monitor_120000 key=10:0:112c0c00-03fb-4f6a-be94-4cf8d83f5cc9)
tengine[1975]: 2009/03/26_16:02:54 info: match_graph_event: Action httpd_2_monitor_120000 (8) confirmed on f801.haijiye (rc=0)
lrmd[1563]: 2009/03/26_16:02:54 info: Managed MailTo_3:monitor process 2100 exited with return code 0.
crmd[1566]: 2009/03/26_16:02:54 info: process_lrm_event: LRM operation MailTo_3_monitor_120000 (call=11, rc=0) complete
tengine[1975]: 2009/03/26_16:02:54 info: match_graph_event: Action MailTo_3_monitor_120000 (10) confirmed on f801.haijiye (rc=0)
tengine[1975]: 2009/03/26_16:02:54 info: run_graph: Transition 0: (Complete=8, Pending=0, Fired=0, Skipped=0, Incomplete=0)
tengine[1975]: 2009/03/26_16:02:54 info: notify_crmd: Transition 0 status: te_complete - <null>
crmd[1566]: 2009/03/26_16:02:54 info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_IPC_MESSAGE origin=route_message ]
heartbeat[1528]: 2009/03/26_16:02:55 info: Link f802.haijiye:/dev/ttyS0 dead.
heartbeat[1528]: 2009/03/26_16:02:59 WARN: node f802.haijiye: is dead
heartbeat[1528]: 2009/03/26_16:02:59 info: Link f802.haijiye:eth0 dead.
crmd[1566]: 2009/03/26_16:02:59 notice: crmd_ha_status_callback: Status update: Node f802.haijiye now has status [dead]
lrmd[1563]: 2009/03/26_16:02:59 info: Managed IPaddr_192_168_1_118:monitor process 2119 exited with return code 0.
lrmd[1563]: 2009/03/26_16:03:04 info: Managed IPaddr_192_168_1_118:monitor process 2133 exited with return code 0.