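Problem: On a two-node Sun V890 cluster (Solaris 8 + Sun Cluster 3.0 + VxVM 4.1), scstat reports that NAFO is not working properly; after a few seconds the resource status changes to Degraded.
pnmstat shows the status cycling OK -> DOUBT -> DOWN -> OK, so the NAFO adapters are unstable.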
root@node-1
# scstat
------------------------------------------------------------------
-- Cluster Nodes --
Node name Status
--------- ------
Cluster node: node-1 Online
Cluster node: node-2 Online
------------------------------------------------------------------
-- Cluster Transport Paths --
Endpoint Endpoint Status
-------- -------- ------
Transport path: node-1:ce3 node-2:ce3 Path online
Transport path: node-1:ce1 node-2:ce1 Path online
------------------------------------------------------------------
-- Quorum Summary --
Quorum votes possible: 4
Quorum votes needed: 3
Quorum votes present: 4
-- Quorum Votes by Node --
Node Name Present Possible Status
--------- ------- -------- ------
Node votes: node-1 1 1 Online
Node votes: node-2 1 1 Online
-- Quorum Votes by Device --
Device Name Present Possible Status
----------- ------- -------- ------
Device votes: /dev/did/rdsk/d8s2 1 1 Online
Device votes: /dev/did/rdsk/d10s2 1 1 Online
------------------------------------------------------------------
-- Device Group Servers --
Device Group Primary Secondary
------------ ------- ---------
Device group servers: rmt/1 - -
Device group servers: rmt/2 - -
Device group servers: nodeadg node-1 -
Device group servers: nodebdg - -
Device group servers: ossdg node-1 node-2
-- Device Group Status --
Device Group Status
------------ ------
Device group status: rmt/1 Offline
Device group status: rmt/2 Offline
Device group status: nodeadg Online
Device group status: nodebdg Offline
Device group status: ossdg Online
------------------------------------------------------------------
-- Resource Groups and Resources --
Group Name Resources
---------- ---------
Resources: oss_rg node_rs ossdg_rs sybase_rs
-- Resource Groups --
Group Name Node Name State
---------- --------- -----
Group: oss_rg node-1 Online
Group: oss_rg node-2 Offline
-- Resources --
Resource Name Node Name State Status Message
------------- --------- ----- --------------
Resource: node_rs node-1 Online Degraded - NAFO Failure.
Resource: node_rs node-2 Offline Offline - LogicalHostname offline.
Resource: ossdg_rs node-1 Online Online
Resource: ossdg_rs node-2 Offline Offline
Resource: sybase_rs node-1 Online Online - Adaptiver server started
Resource: sybase_rs node-2 Offline Offline - Adaptive_server: STOPPED monitor_server: STOPPED backup_server: STOPPED
root@node-1
# pnmstat -l
group  adapters  status  fo_time  act_adp
nafo0  ce0:ce2   DOWN    48       ce0
root@node-1
#
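To watch the OK -> DOUBT -> DOWN -> OK flapping over time, the status column of `pnmstat -l` can be sampled in a loop. This is a minimal sketch, assuming the five-column layout shown in the transcript above (group, adapters, status, fo_time, act_adp). The function name `nafo_status` and the 5-second interval are illustrative choices, not part of Sun Cluster.

```shell
#!/bin/sh
# nafo_status: read `pnmstat -l` output on stdin and print "<group> <status>"
# for every NAFO group. The header line (first field "group") is skipped.
# Assumption: status is the third whitespace-separated column, as in the
# transcript above.
nafo_status() {
    awk '$1 != "group" && NF >= 3 { print $1, $3 }'
}

# Usage on a cluster node (left commented out because it needs pnmstat
# and runs until interrupted):
#
# while :; do
#     date '+%H:%M:%S'
#     pnmstat -l | nafo_status
#     sleep 5
# done
```

A log of these samples makes it easy to compare behaviour before and after a configuration change.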
Solution: This is a Sun Cluster bug. Add the following lines to /etc/system:
set ce:ce_reclaim_pending=1
set ce:ce_taskq_disable=1
After a reboot, NAFO works normally.
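Because a bad /etc/system can prevent the system from booting, it is worth making the change with a backup and without duplicating lines. The following is a minimal sketch, not a vendor-supplied procedure. The function name `add_ce_tunables` and the `.bak` suffix are hypothetical, and the script assumes a POSIX shell and a grep that supports `-F` and `-x` (on Solaris 8 this likely means using the tools under /usr/xpg4/bin).

```shell
#!/bin/sh
# add_ce_tunables FILE: append the two ce driver settings from this post
# to FILE (normally /etc/system), skipping any line already present.
add_ce_tunables() {
    file=$1
    # Keep a one-time backup; an existing backup is never overwritten.
    [ -f "$file.bak" ] || cp -p "$file" "$file.bak" || return 1
    for line in 'set ce:ce_reclaim_pending=1' 'set ce:ce_taskq_disable=1'; do
        # -F: fixed string, -x: whole-line match, so reruns add nothing.
        grep -Fx "$line" "$file" >/dev/null 2>&1 || printf '%s\n' "$line" >> "$file"
    done
}

# Usage (as root), followed by a reboot:
# add_ce_tunables /etc/system
```

Running the function twice leaves the file unchanged the second time, so it is safe to repeat on both nodes.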
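Summary: This problem dragged on for a long time and was only solved after searching the company's case database. At first I assumed the cluster installation was at fault and, trying to fix it, reinstalled the operating system, the cluster software, and VxVM several times, but the problem persisted. It was finally resolved just before the holiday ended, which was a relief.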
This article is from the ChinaUnix blog. To view the original, see: http://blog.chinaunix.net/u1/43930/showart_461434.html