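XIV. Applications
70 How do I add an application resource?
A: Adding an application resource means registering a new resource of the generic type (GDS). You can do this through the web interface or with the scsetup command. You must provide a start script and a stop script; a probe script is optional. If the probe script would only check whether the application process exists (looking for the process name with ps), there is no need to add it, because Sun Cluster already monitors that. A proper probe script should check whether the application is actually working correctly and set the appropriate return value.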
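A probe that tests whether the application really works, rather than only whether its process exists, could be sketched as follows. This is an illustration, not the script used in this setup: it assumes GDS treats exit status 0 as healthy and 100 as complete failure, and the function names and the status.sh helper are hypothetical.

```shell
#!/bin/sh
# Sketch of a GDS probe script.
# Assumption: exit status 0 = healthy, 100 = complete failure.

# run_probe CHECK_COMMAND [ARGS...]
# Runs the given health check and maps its result to a probe exit status.
run_probe() {
    if "$@" >/dev/null 2>&1; then
        return 0      # the application answered correctly
    fi
    return 100        # treat any failed check as a complete failure
}

# Hypothetical health check: replace the body with a real request to the
# application (for example a client command that runs a trivial query).
check_app() {
    /home/oracle/yjr/freesm/bin/status.sh   # hypothetical helper, not part of the original setup
}

# In a real probe.sh the last two lines would be:
#   run_probe check_app
#   exit $?
```

Keeping the mapping to exit codes in one small function makes it easy to swap in a stricter health check later without touching the return-value logic.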
Steps:
1. # scsetup
*** Main Menu ***
Please select from one of the following options:
1) Quorum
2) Resource groups
3) Data Services
4) Cluster interconnect
5) Device groups and volumes
6) Private hostnames
7) New nodes
8) Other cluster properties
?) Help with menu options
q) Quit
Option: 2
*** Resource Group Menu ***
Please select from one of the following options:
1) Create a resource group
2) Add a network resource to a resource group
3) Add a data service resource to a resource group
4) Resource type registration
5) Online/Offline or Switchover a resource group
6) Enable/Disable a resource
7) Change properties of a resource group
8) Change properties of a resource
9) Remove a resource from a resource group
10) Remove a resource group
11) Clear the stop_failed error flag from a resource
?) Help
s) Show current status
q) Return to the main menu
Option: 3
2. Select 3) Add a data service resource to a resource group
3. Select 5) SUNW.gds:3.1
>>> Add a Data Service Resource to a Resource Group <<<
This option allows you to add a data service resource to a resource
group. If the resource type for the data service is not yet
registered with the cluster, you will have the opportunity to
register that type.
Is it okay to continue (yes/no) [yes]?
Select the type of resource you want to add:
Res Name Description
======== ===========
1) SUNW.Event HA Event server for Sun Cluster
2) SUNW.HAStorage HA Storage Resource Type
3) SUNW.HAStoragePlus HA Storage Plus - A Resource Type which sub ...
4) SUNW.RGOffload Offload Resource Group
5) SUNW.gds:3.1 Generic Data Service for Sun Cluster
6) SUNW.test test server for Sun Cluster
Option: 5
4. Enter the resource name
What is the name of the resource you want to add? freesm
5. Select the resource group to add the resource to
Select the resource group you want to use for "freesm":
Group Name Type Description
========== ==== ===========
1) cluster Failover
6. Set the resource property values
Are you done setting properties (yes/no) [yes]?
Here is the list of extension properties you want to set:
Start_command=/home/oracle/yjr/freesm/bin/start.sh
Stop_command=/home/oracle/yjr/freesm/bin/stop.sh
Probe_command=/home/oracle/yjr/freesm/bin/probe.sh
Is it correct (yes/no) [yes]?
Is it okay to proceed with the update (yes/no) [yes]?
scrgadm -a -j freesm -g freesm -t SUNW.gds:3.1 -y Scalable=false -y Port_list=21/tcp -x Start_command="/home/oracle/yjr/freesm/bin/start.sh" -x Stop_command="/home/oracle/yjr/freesm/bin/stop.sh" -x Probe_command="/home/oracle/yjr/freesm/bin/probe.sh"
scrgadm -c -j freesm -y R_description="Scalable data service resource for SUNW.gds:3.1"
Commands completed successfully.
7. Enable the resource
Do you want to enable this resource (yes/no) [yes]?
scswitch -e -j freesm
scswitch -e -M -j freesm
Commands completed successfully.
8. Bring the resource group online: scswitch -z -g freesm -h goalnet-32
*** Resource Group Menu ***
Please select from one of the following options:
1) Create a resource group
2) Add a network resource to a resource group
3) Add a data service resource to a resource group
4) Resource type registration
5) Online/Offline or Switchover a resource group
6) Enable/Disable a resource
7) Change properties of a resource group
8) Change properties of a resource
9) Remove a resource from a resource group
10) Remove a resource group
11) Clear the stop_failed error flag from a resource
?) Help
s) Show current status
q) Return to the main menu
Option: 5
>>> Online/Offline or Switchover a Resource Group <<<
Use this option to bring a resource group online or offline on one or
more cluster nodes. For failover resource groups which are already
online, use this option to switch the primary owner of a group. For
scalable groups which are already online, use it to change the set of
current primaries.
This option is also used to control the managed and unmanaged state
of a resource group.
Once a resource group is brought online, all enabled resources in
that group become available to clients.
Once a resource group is taken offline from all cluster nodes, the
resources in that group are no longer available to clients of those
resources.
Is it okay to continue (yes/no) [yes]?
Select the resource group you want to change:
Group Name Type State(s)
========== ==== ========
1) freesm failover Online
2) cluster failover Online
q) Done
Option: 1
1) Switch ownership of the group
2) Bring the group offline from all cluster nodes
3) Put the group into an unmanaged state
q) Quit
Option: 1
Select the node to take ownership of "freesm":
1) goalnet-32
q) Done
Option: 1
Switching primary ownership of a failover resource group typically
results in a brief outage of service for all resources in the group.
Are you sure you want to switch "freesm" (yes/no) [yes]?
scswitch -z -g freesm -h goalnet-32
Command completed successfully.
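The commands that scsetup echoed during the steps above can also be collected into a script for repeat use. The sketch below is a dry run that only prints them; the resource, group, node, and path values are the ones from this example, and the variable and function names are made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
RES=freesm                        # resource name
RG=freesm                         # resource group name
NODE=goalnet-32                   # node that should host the group
BIN=/home/oracle/yjr/freesm/bin   # directory holding the start/stop/probe scripts

gds_commands() {
    # Register the GDS resource with its start, stop and probe commands.
    echo "scrgadm -a -j $RES -g $RG -t SUNW.gds:3.1 -y Scalable=false -y Port_list=21/tcp -x Start_command=$BIN/start.sh -x Stop_command=$BIN/stop.sh -x Probe_command=$BIN/probe.sh"
    # Enable the resource and its fault monitor.
    echo "scswitch -e -j $RES"
    echo "scswitch -e -M -j $RES"
    # Bring the resource group online on the chosen node.
    echo "scswitch -z -g $RG -h $NODE"
}

gds_commands
```

Review the printed commands first, then run them by hand or pipe the output to sh on a cluster node.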
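71 How do I change node names under Sun Cluster 2.2?
A: 1. Stop SC 2.2 first. Before stopping it, remove the data services and logical hosts.
2. Back up the SC 2.2 database files as a precaution:
# cd /etc/opt/SUNWcluster
# tar cvf conf.tar conf
3. On both hosts, edit the files that contain the hostname:
/etc/hosts
/etc/hostname.interface
/etc/nodename
/etc/net/ticlts/hosts
/etc/net/ticots/hosts
/etc/net/ticotsord/hosts
4. Reboot both hosts.
5. After the systems come up, do not start the cluster. View the existing node information with:
# scconf clustername -p
Current Configuration for Cluster clustername:
Hosts in cluster:
6. Then change the node names with:
# scconf clustername -h newhostname1 newhostname2
7. Check that the change took effect:
# scconf clustername -p
8. Start the cluster on the primary host:
# scadmin startcluster newhostname1 clustername
9. Start the cluster on the standby host:
# scadmin startnode
10. The cluster should now start normally.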
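Step 3 means editing the same hostname in several files, which is easy to get wrong by hand. A minimal sketch of a helper for that step is shown below; the function name is hypothetical, it only replaces whole whitespace-separated fields, and it keeps a .bak copy of each file. Lines that are changed are rewritten with single spaces between fields.

```shell
#!/bin/sh
# replace_hostname OLD NEW FILE
# Replaces fields that are exactly OLD with NEW in FILE, keeping FILE.bak.
replace_hostname() {
    old=$1; new=$2; file=$3
    cp "$file" "$file.bak" || return 1
    # Compare each field with the old name so that "node1" never matches "node10".
    awk '{ for (i = 1; i <= NF; i++) if ($i == old) $i = new; print }' \
        old="$old" new="$new" "$file.bak" > "$file"
}

# Example use (not executed here), with placeholder names and the files from step 3:
#   for f in /etc/hosts /etc/nodename /etc/net/ticlts/hosts \
#            /etc/net/ticots/hosts /etc/net/ticotsord/hosts; do
#       replace_hostname oldhostname1 newhostname1 "$f"
#   done
```

The /etc/hostname.interface file is left out of the example loop because its real name depends on the interface in use.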
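72 How do I change the network interfaces of a NAFO group?
A: Assume the NAFO interface was originally qfe3 and is being changed to ge1, with ge0 as the data/application interface.
# cd /etc
# mv hostname.qfe3 hostname.ge0 ## on both nodes
# scshutdown -y -g0 ## on one node only; brings the cluster down to the ok prompt
# boot -x ## on both nodes; boots without starting the cluster
# pnmset -c nafo0 -o delete ## on both nodes; deletes the old configuration
# pnmset -c nafo0 -o create ge0 ge1 ## on both nodes; creates the new configuration
# pnmset -p ## on both nodes; shows the NAFO configuration
# cd /etc/cluster
# vi pnmconfig ## change the original "nafo0 qfe3" entry to "nafo0 ge0 ge1"
# reboot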
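73 How do I tune the fault monitors (application monitoring)?
A: 1. Setting the interval between fault monitor probes: change the Thorough_probe_interval value.
2. Setting the timeout for fault monitor probes: change the Probe_timeout value.
3. Defining the criteria for persistent faults: the default value is 3. When the application exits abnormally, Sun Cluster treats this as a non-critical error, tries to restart the application locally, and increments a counter. If the application still fails after 3 attempts, the error is considered persistent and the resource group is switched over.
Change the following parameters:
Retry_count
Retry_interval
4. Specifying the failover behavior of a resource
Failover_mode has five options: NONE, SOFT, HARD, RESTART_ONLY, LOG_ONLY
Each option results in different failover behavior:
1. NONE: If the resource's start method fails, the cluster software sets the resource state to "method failed" and waits for user intervention. If the resource's stop method fails, the resource state is set to "stop failed", the resource group state is set to Error_stop_failed, and the cluster waits for user intervention.
2. SOFT: If the resource's start method fails, the cluster starts the resource on the other host and switches the whole resource group. If the resource's stop method fails, the resource state is set to "stop failed", the resource group state is set to Error_stop_failed, and the cluster waits for user intervention.
3. HARD: If the resource's start method fails, the cluster starts the resource on the other host and switches the whole resource group. If the resource's stop method fails, the node is taken out of service and the resource group is switched to the other node.
4. RESTART_ONLY: On any error, the cluster software may only restart the application (resource) locally. It does not restart or switch the whole resource group. Once retry_count is exceeded, no further resource restarts are attempted.
5. LOG_ONLY: On an error, the cluster software only logs the event and performs no switchover or resource restart.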
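The properties above can be changed with scrgadm -c on an existing resource. The sketch below only prints such a command. It assumes a GDS resource, where Probe_timeout is an extension property (-x) and the other four are standard properties (-y); the function name and the numeric values are illustrative, not recommendations.

```shell
#!/bin/sh
# tune_monitor_cmd RESOURCE PROBE_INTERVAL PROBE_TIMEOUT RETRY_COUNT RETRY_INTERVAL MODE
# Prints (does not run) one scrgadm command that applies all five settings.
tune_monitor_cmd() {
    echo "scrgadm -c -j $1" \
         "-y Thorough_probe_interval=$2" \
         "-x Probe_timeout=$3" \
         "-y Retry_count=$4 -y Retry_interval=$5" \
         "-y Failover_mode=$6"
}

# Illustrative values: probe every 60 s with a 30 s timeout, allow 3 local
# restarts within 300 s, then fail over (SOFT).
tune_monitor_cmd freesm 60 30 3 300 SOFT
```

Check the result afterwards with scrgadm -pvv -j freesm.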
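XV. Commands
74 Check resource group status
# scstat -g
75 Check disk (device) group status
# scstat -D
76 Check quorum device status
# scstat -q
77 Check cluster interconnect (heartbeat) status
# scstat -W
78 Check the status of all nodes
# scstat -n
79 Check the status of a single node
# scstat -h hostname
80 Check IP address status
# scstat -i
81 Check all status
# scstat -pv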
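For anyone who finds the scstat flags hard to remember, the list in items 74 to 81 can be wrapped in a small lookup. This is just a convenience sketch; the keyword names and the function name are invented here, and only the flags come from the list above.

```shell
#!/bin/sh
# scstat_flag KEYWORD
# Prints the scstat option for a status area; returns 1 for unknown keywords.
scstat_flag() {
    case $1 in
        groups)       echo "-g" ;;   # resource groups
        devices)      echo "-D" ;;   # disk (device) groups
        quorum)       echo "-q" ;;   # quorum devices
        interconnect) echo "-W" ;;   # heartbeat / cluster interconnect
        nodes)        echo "-n" ;;   # all nodes
        ip)           echo "-i" ;;   # IP addresses
        all)          echo "-pv" ;;  # everything, verbose
        *)            return 1 ;;
    esac
}

# Example use on a cluster node (not executed here):
#   scstat $(scstat_flag quorum)
```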
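82 Display resource information
# scrgadm -pv -j mydb
# scrgadm -pvv -j mydb
# scrgadm -pv
# scrgadm -pvv
83 Display the cluster configuration
# scconf -pv
84 Create a resource group
# scrgadm -a -g resource-group-1 -h phys-schost-1,phys-schost-2
# scrgadm -pv -g resource-group-1
85 Bring a resource group online
# scswitch -Z -g smppcluster
86 Take a resource group offline on all nodes
# scswitch -F -g smppcluster
87 Switch a resource group to a specific node
# scswitch -z -g smppcluster -h SMPP02
88 Put a resource group into the unmanaged state
# scswitch -u -g resource-group
89 Put a resource group into the managed state
# scswitch -o -g resource-group
90 Remove a node from a resource group
# scrgadm -c -g failover-resource-group -h nodelist
The nodelist must not include the name of the node being removed.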
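Because the new nodelist is simply the old one minus the node being removed, it can be computed rather than typed. A minimal sketch, with a hypothetical function name and example node names:

```shell
#!/bin/sh
# nodelist_without NODELIST NODE
# Prints the comma-separated NODELIST with NODE removed.
nodelist_without() {
    result=""
    old_ifs=$IFS
    IFS=,
    for n in $1; do                        # split the list on commas
        if [ "$n" != "$2" ]; then
            result="${result:+$result,}$n" # append, adding a comma when needed
        fi
    done
    IFS=$old_ifs
    echo "$result"
}

# Print the command that drops phys-schost-2 from a three-node group.
echo "scrgadm -c -g failover-resource-group -h $(nodelist_without phys-schost-1,phys-schost-2,phys-schost-3 phys-schost-2)"
```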
91 Display resource group information
# scrgadm -pv -g smppcluster
# scrgadm -pvv -g smppcluster
92 Add an IP address to a resource group
# scrgadm -a -L [-j resource] -g resource-group -l hostnamelist, ... [-n netiflist]
93 How do I check NAFO status?
# pnmstat -p
94 How do I shut down the entire cluster to the ok prompt?
# scshutdown -g0 -y