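I built this cluster half a year ago; the OS is RHEL4U6. It has been running fine all along, but recently I ran a fairly large program and node1 and node2 both hung at the same time. I found errors in the log file osm.log. The details are as follows: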
Aug 14 14:11:54 649528 [9558BA80] -> OpenSM Rev:openib-3.0.13
Aug 14 14:11:54 649585 [9558BA80] -> OpenSM Rev:openib-3.0.13
Aug 14 14:11:54 679266 [9558BA80] -> osm_vendor_bind: Binding to port 0x2c90200268ab1
Aug 14 14:11:54 680988 [9558BA80] -> osm_vendor_bind: Binding to port 0x2c90200268ab1
Aug 14 14:11:54 685171 [43204960] -> Entering STANDBY state
Aug 14 14:11:55 109635 [42803960] -> Entering MASTER state
Aug 14 14:11:55 114025 [41401960] -> osm_report_notice: Reporting Generic Notice type:3 num:66 from LID:0x0000 GID:0xfe80000000000000,0x0002c90200268ab1
Aug 14 14:11:55 114055 [41401960] -> osm_report_notice: Reporting Generic Notice type:3 num:66 from LID:0x0000 GID:0xfe80000000000000,0x0002c90200268ab1
Aug 14 14:11:55 121908 [42803960] -> osm_ucast_mgr_process: null (min-hop) tables configured on all switches
Aug 14 14:11:55 322501 [41E02960] -> SUBNET UP
Aug 14 14:12:14 865388 [43C05960] -> __osm_mcmr_rcv_join_mgrp: ERR 1B11: method = SubnAdmSet, scope_state = 0x1, component mask = 0x0000000000010083, expected comp mask = 0x00000000000130c7, MGID: 0xffffffffffff0000 : 0x194a1480ffffffff from port 0x0002c90200268b41 (MT25204 InfiniHostLx Mellanox Technologies)
Aug 14 14:12:15 867097 [44606960] -> __osm_mcmr_rcv_join_mgrp: ERR 1B11: method = SubnAdmSet, scope_state = 0x1, component mask = 0x0000000000010083, expected comp mask = 0x00000000000130c7, MGID: 0xffffffffffff0000 : 0x0000000000000000 from port 0x0002c90200268b41 (MT25204 InfiniHostLx Mellanox Technologies)
Aug 14 14:12:16 867109 [41401960] -> __osm_mcmr_rcv_join_mgrp: ERR 1B11: method = SubnAdmSet, scope_state = 0x1, component mask = 0x0000000000010083, expected comp mask = 0x00000000000130c7, MGID: 0xffffffffffff0000 : 0x0000000000000000 from port 0x0002c90200268b41 (MT25204 InfiniHostLx Mellanox Technologies)
Aug 14 14:12:17 868109 [45A08960] -> __osm_mcmr_rcv_join_mgrp: ERR 1B11: method = SubnAdmSet, scope_state = 0x1, component mask = 0x0000000000010083, expected comp mask = 0x00000000000130c7, MGID: 0xffffffffffff0000 : 0x0000000000000000 from port 0x0002c90200268b41 (MT25204 InfiniHostLx Mellanox Technologies)
Aug 14 14:12:18 887693 [43C05960] -> __osm_mcmr_rcv_join_mgrp: ERR 1B11: method = SubnAdmSet, scope_state = 0x1, component mask = 0x0000000000010083, expected comp mask = 0x00000000000130c7, MGID: 0xffffffffffff0000 : 0x194a1480ffffffff from port 0x0002c90200268b41 (MT25204 InfiniHostLx Mellanox Technologies)
Aug 14 14:12:20 895271 [45007960] -> __osm_mcmr_rcv_join_mgrp: ERR 1B11: method = SubnAdmSet, scope_state = 0x1, component mask = 0x0000000000010083, expected comp mask = 0x00000000000130c7, MGID: 0xffffffffffff0000 : 0x194a1480ffffffff from port 0x0002c90200268b41 (MT25204 InfiniHostLx Mellanox Technologies)
Could anyone explain what this means? I searched Google for a long time and found nothing.
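One way to read the ERR 1B11 lines is to compare the two component masks bit by bit. The sketch below does only that arithmetic on the values from the log. The field names and bit positions in `MCR_FIELDS` are an assumption (the MCMemberRecord component-mask ordering as I understand it from OpenSM's `IB_MCR_COMPMASK_*` definitions), and `decode_mask` and `missing_fields` are hypothetical helper names. It shows which fields the join request left unset; it does not by itself show that this is related to the node hangs.

```python
# Assumed bit order of MCMemberRecord component-mask fields (bit 0 first).
# This ordering is an assumption, not something stated in the log itself.
MCR_FIELDS = [
    "MGID", "PortGID", "Q_Key", "MLID",
    "MTUSelector", "MTU", "TClass", "P_Key",
    "RateSelector", "Rate", "PacketLifeTimeSelector", "PacketLifeTime",
    "SL", "FlowLabel", "HopLimit", "Scope",
    "JoinState", "ProxyJoin",
]


def decode_mask(mask):
    """Return the names of all fields whose bit is set in mask."""
    return [name for bit, name in enumerate(MCR_FIELDS) if mask & (1 << bit)]


def missing_fields(received, expected):
    """Return fields set in the expected mask but absent from the received one."""
    return decode_mask(expected & ~received)


# Values copied from the ERR 1B11 lines in osm.log.
received = 0x0000000000010083
expected = 0x00000000000130C7

print("received:", decode_mask(received))
print("expected:", decode_mask(expected))
print("missing: ", missing_fields(received, expected))
```

Under that assumed ordering, the request from port 0x0002c90200268b41 carries only MGID, PortGID, P_Key and JoinState, while the mask OpenSM expects additionally includes Q_Key, TClass, SL and FlowLabel.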