CentOS4.4+GFS+Oracle10g RAC+VMWARE

#1 | Posted on 2006-12-24 17:31
I have spent the past few days setting up a RAC test environment with CentOS 4.4 + GFS 6.1 + Oracle 10g RAC + VMware Server 1.0.1. After a lot of effort, and with the help of articles on ChinaUnix, the installation is finally done. I followed Oracle_GFS.pdf, which can be downloaded from the Red Hat website. I now have the following questions:

1. In my test environment, if one node goes down, the other node can no longer access the shared disk (the GFS filesystem). Why is that? I am using lock_dlm.
2. I don't have a fence device at hand, so I chose fence_manual when building the cluster. Does the IBM HS21 BladeCenter already provide fence device functionality? I plan to use HS21 in production.

#2 | Posted on 2006-12-25 16:25
1. The cluster has a quorum requirement; the minimum is 2 nodes, so with one node down the remaining node naturally becomes inquorate.
2. The BladeCenter does have fencing support; as I recall you configure it by connecting to 192.168.70.125 (the management module).
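
To check whether the management module answers the fencing agent, a quick test from one node might look like this (a sketch only: the IP, the default USERID/PASSW0RD credentials and the blade number are placeholders, and the exact flags can differ between fence_bladecenter versions, so check fence_bladecenter -h first):

    # Query the power status of blade 1 via the BladeCenter management module (placeholder credentials)
    fence_bladecenter -a 192.168.70.125 -l USERID -p PASSW0RD -n 1 -o status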

[ Last edited by fuumax on 2006-12-25 16:28 ]

#3 | Posted on 2006-12-26 09:26
Could you share your installation steps?

#4 | Posted on 2006-12-26 13:00

CentOS4.4 + RHCS(DLM) + GFS + Oracle10gR2 RAC + VMWare Server 1.0.1 安装

This write-up draws on many posts from this forum; many thanks to their authors!

****************************************************************************
* CentOS4.4 + RHCS(DLM) + GFS + Oracle10gR2 RAC + VMWare Server 1.0.1 安装 *
****************************************************************************

I. Test environment
        Host: one PC with a 64-bit AMD CPU and 4 GB of RAM, running CentOS-4.4-x86_64.
        Two virtual machines were created on this host, both running CentOS-4.4-x86_64 with no kernel customization, updated online to the latest packages.

II. Install VMware Server 1.0.1 for Linux

III. Create the shared disks

        vmware-vdiskmanager -c -s 6Gb -a lsilogic -t 2 "/vmware/share/ohome.vmdk"   # shared Oracle Home
        vmware-vdiskmanager -c -s 10Gb -a lsilogic -t 2 "/vmware/share/odata.vmdk"  # datafiles and indexes
        vmware-vdiskmanager -c -s 3Gb -a lsilogic -t 2 "/vmware/share/oundo1.vmdk"  # node1 redo logs and undo tablespace
        vmware-vdiskmanager -c -s 3Gb -a lsilogic -t 2 "/vmware/share/oundo2.vmdk"  # node2 redo logs and undo tablespace
        vmware-vdiskmanager -c -s 512Mb -a lsilogic -t 2 "/vmware/share/oraw.vmdk"  # Oracle Cluster Registry and CRS voting disk

        The two virtual machines share these disks.
       
IV. Install the virtual machines
        1. In the VMware console create a VMware guest OS named gfs-node01. Choose custom create -> Red Hat Enterprise Linux 4 64-bit and keep the defaults for everything else.
           Give it 1 GB of memory (above 800 MB you no longer get the low-memory warning) and a 12 GB disk; do not choose pre-allocated.

        2. After the guest OS has been created, add a second NIC (network card) to the guest.

        3. Close the VMware console, open gfs-node1.vmx in the node1 directory, and append the following at the end of the file:

scsi1.present = "TRUE"
scsi1.virtualDev = "lsilogic"
scsi1.sharedBus = "virtual"

scsi1:1.present = "TRUE"
scsi1:1.mode = "independent-persistent"
scsi1:1.filename = "/vmware/share/ohome.vmdk"
scsi1:1.deviceType = "disk"

scsi1:2.present = "TRUE"
scsi1:2.mode = "independent-persistent"
scsi1:2.filename = "/vmware/share/odata.vmdk"
scsi1:2.deviceType = "disk"

scsi1:3.present = "TRUE"
scsi1:3.mode = "independent-persistent"
scsi1:3.filename = "/vmware/share/oundo1.vmdk"
scsi1:3.deviceType = "disk"

scsi1:4.present = "TRUE"
scsi1:4.mode = "independent-persistent"
scsi1:4.filename = "/vmware/share/oundo2.vmdk"
scsi1:4.deviceType = "disk"

scsi1:5.present = "TRUE"
scsi1:5.mode = "independent-persistent"
scsi1:5.filename = "/vmware/share/oundo3.vmdk"
scsi1:5.deviceType = "disk"

scsi1:6.present = "TRUE"
scsi1:6.mode = "independent-persistent"
scsi1:6.filename = "/vmware/share/oraw.vmdk"
scsi1:6.deviceType = "disk"

disk.locking = "false"
diskLib.dataCacheMaxSize = "0"
diskLib.dataCacheMaxReadAheadSize = "0"
diskLib.DataCacheMinReadAheadSize = "0"
diskLib.dataCachePageSize = "4096"
diskLib.maxUnsyncedWrites = "0"

        This block defines how VMware handles the shared disks. Most people know to set disk.locking = "false" but miss the dataCache settings.

        After saving and re-opening the vmware-console, you will see all of these disks listed in the guest OS configuration.


V. Packages to install, and their order

        Option 1: install with yum:
        1. Update CentOS 4.4
                yum update
        2. Install the csgfs packages
                yum install yumex
                cd /etc/yum.repos.d
                wget http://mirror.centos.org/centos/4/csgfs/CentOS-csgfs.repo
                yumex
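
        Instead of picking the packages one by one in the yumex GUI, you could install them non-interactively once the csgfs repo file is in place. A sketch only: the package names are assumed to match the rpm list in the next step, and the kernel-module packages must match the kernel you boot (the script further below uses the uniprocessor variants):

                # Non-interactive alternative to yumex, using assumed csgfs package names
                yum install rgmanager system-config-cluster ccs magma magma-plugins cman cman-kernel \
                            dlm dlm-kernel fence iddev GFS GFS-kernel GFS-kernheaders lvm2-cluster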

        Option 2: install manually with rpm.
    Package download location: http://mirror.centos.org/centos/4/csgfs/x86_64/RPMS/

        1. Install the required packages on every node; see the GFS 6.1 user manual for the complete package list.

rgmanager             — Manages cluster services and resources
system-config-cluster — Contains the Cluster Configuration Tool, used to graphically configure the cluster and to display the current status of the nodes, resources, fencing agents, and cluster services
ccs                   — Contains the cluster configuration services daemon (ccsd) and associated files
magma                 — Contains an interface library for cluster lock management
magma-plugins         — Contains plugins for the magma library
cman                  — Contains the Cluster Manager (CMAN), which is used for managing cluster membership, messaging, and notification
cman-kernel           — Contains the required CMAN kernel modules
dlm                   — Contains the distributed lock manager (DLM) library
dlm-kernel            — Contains the required DLM kernel modules
fence                 — The cluster I/O fencing system, which allows cluster nodes to connect to a variety of network power switches, fibre channel switches, and integrated power management interfaces
iddev                 — Contains libraries used to identify the file system (or volume manager) with which a device is formatted

Optionally, you can also install Red Hat GFS on top of Red Hat Cluster Suite. Red Hat GFS consists of the following RPMs:
GFS                   — The Red Hat GFS module
GFS-kernel            — The Red Hat GFS kernel module
lvm2-cluster          — Cluster extensions for the logical volume manager
GFS-kernheaders       — GFS kernel header files


        2. Install the packages in the following order
The installation script, install.sh:
#!/bin/bash

rpm -ivh kernel-smp-2.6.9-42.EL.x86_64.rpm
rpm -ivh kernel-smp-devel-2.6.9-42.EL.x86_64.rpm

rpm -ivh perl-Net-Telnet-3.03-3.noarch.rpm
rpm -ivh magma-1.0.6-0.x86_64.rpm

rpm -ivh magma-devel-1.0.6-0.x86_64.rpm

rpm -ivh ccs-1.0.7-0.x86_64.rpm
rpm -ivh ccs-devel-1.0.7-0.x86_64.rpm

rpm -ivh cman-kernel-2.6.9-45.4.centos4.x86_64.rpm
rpm -ivh cman-kernheaders-2.6.9-45.4.centos4.x86_64.rpm
rpm -ivh cman-1.0.11-0.x86_64.rpm
rpm -ivh cman-devel-1.0.11-0.x86_64.rpm

rpm -ivh dlm-kernel-2.6.9-42.12.centos4.x86_64.rpm
rpm -ivh dlm-kernheaders-2.6.9-42.12.centos4.x86_64.rpm
rpm -ivh dlm-1.0.1-1.x86_64.rpm
rpm -ivh dlm-devel-1.0.1-1.x86_64.rpm


rpm -ivh fence-1.32.25-1.x86_64.rpm

rpm -ivh GFS-6.1.6-1.x86_64.rpm
rpm -ivh GFS-kernel-2.6.9-58.2.centos4.x86_64.rpm
rpm -ivh GFS-kernheaders-2.6.9-58.2.centos4.x86_64.rpm

rpm -ivh iddev-2.0.0-3.x86_64.rpm
rpm -ivh iddev-devel-2.0.0-3.x86_64.rpm

rpm -ivh magma-plugins-1.0.9-0.x86_64.rpm

rpm -ivh rgmanager-1.9.53-0.x86_64.rpm

rpm -ivh system-config-cluster-1.0.25-1.0.noarch.rpm

rpm -ivh ipvsadm-1.24-6.x86_64.rpm

rpm -ivh piranha-0.8.2-1.x86_64.rpm --nodeps


Note: some of these packages have dependencies on each other; where rpm complains, install them with the --nodeps switch.
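
If you prefer the cluster services to come up automatically at boot instead of being started by hand (as in step VII below), the init scripts shipped with these packages can be enabled on both nodes. A sketch, assuming the standard RHCS init script names:

chkconfig ccsd on         # cluster configuration system daemon
chkconfig cman on         # cluster manager
chkconfig fenced on       # fence daemon
chkconfig gfs on          # mounts the GFS entries from /etc/fstab at boot
chkconfig rgmanager on    # resource group manager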


        3. Edit /etc/hosts on every node (identical on each node), for example:
        [root@gfs-node1 etc]# cat hosts
        # Do not remove the following line, or various programs
        # that require network functionality will fail.
        127.0.0.1       localhost.localdomain localhost

        192.168.154.211 gfs-node1
        192.168.154.212 gfs-node2

        192.168.10.1    node1-prv
        192.168.10.2    node2-prv

        192.168.154.201 node1-vip
        192.168.154.202 node2-vip

        Note: the hostname, the cluster node name, and the public node name used in the Oracle cluster (CRS) configuration should ideally be identical.


VI. Run system-config-cluster to configure the cluster

Add the two nodes, setting each node's votes (weight) to 1, i.e. each node contributes one vote to the quorum.

The two node names are:
gfs-node1
gfs-node2

Then edit the cluster.conf file so that it looks like this:

[root@gfs-node1 ~]# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="1" name="alpha_cluster">
        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="gfs-node1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="F-Man" nodename="gfs-node1" ipaddr="192.168.10.1"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="gfs-node2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="F-Man" nodename="gfs-node2" ipaddr="192.168.10.2"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
               
        <cman/>

        <fencedevices>
                <fencedevice agent="fence_manual" name="F-Man"/>
        </fencedevices>
        
        <rm>
                <failoverdomains>
                        <failoverdomain name="web_failover" ordered="1" restricted="0">
                                <failoverdomainnode name="gfs-node1" priority="1"/>
                                <failoverdomainnode name="gfs-node2" priority="2"/>
                        </failoverdomain>
                </failoverdomains>
        </rm>
</cluster>

[Note] Use fence_bladecenter. This requires telnet to be enabled on
        your management module (which may require a firmware update).
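
        For reference, a fence_bladecenter setup in cluster.conf might look like the sketch below; the address, credentials and blade numbers are placeholders, not values from this environment:

        <fencedevices>
                <fencedevice agent="fence_bladecenter" name="bc-fence"
                             ipaddr="192.168.70.125" login="USERID" passwd="PASSW0RD"/>
        </fencedevices>

        <!-- and inside each clusternode: -->
        <fence>
                <method name="1">
                        <device name="bc-fence" blade="1"/>   <!-- blade bay of this node -->
                </method>
        </fence>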

Use the scp command to copy this configuration file to node2.
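
For example (assuming /etc/cluster already exists on node2):

[root@gfs-node1 ~]# scp /etc/cluster/cluster.conf gfs-node2:/etc/cluster/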

VII. Start the dlm, ccsd, fence and related services on nodes 01 and 02
        After completing the installation and configuration above, some of these services may already be running.

        1. Load the dlm module on both nodes

        [root@gfs-node1 cluster]# modprobe lock_dlm
        [root@gfs-node2 cluster]# modprobe lock_dlm

        2. Start the ccsd service
        [root@gfs-node1 cluster]# ccsd
        [root@gfs-node2 cluster]# ccsd

        3. Start the cluster manager (cman)
        root@gfs-node1 # /sbin/cman_tool join  
        root@gfs-node2 # /sbin/cman_tool join  

        4. Test the ccsd service
        (Note: wait until cman has finished starting before running the test below.)

        [root@gfs-node1 cluster]# ccs_test connect
        [root@gfs-node2 cluster]# ccs_test connect

        # On each node, ccs_test connect returns the following:
        node 1:
        [root@gfs-node1 cluster]# ccs_test connect
        Connect successful.
        Connection descriptor = 0
        node 2:
        [root@gfs-node2 cluster]# ccs_test connect
        Connect successful.
        Connection descriptor = 30

        5. Check the node status
        cat /proc/cluster/nodes should return:
        [root@gfs-node1 cluster]# cat /proc/cluster/nodes
        Node  Votes Exp Sts  Name
          1    1    3   M   gfs-node1
          2    1    3   M   gfs-node2

        [root@gfs-node1 cluster]#

VIII. Join the fence domain:
[root@gfs-node1 cluster]# /sbin/fence_tool join
[root@gfs-node2 cluster]# /sbin/fence_tool join
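
Once both nodes have joined, you can confirm that the fence domain lists both members. A quick check, assuming the RHCS 4 /proc interface and the clustat utility from rgmanager:

[root@gfs-node1 ~]# cat /proc/cluster/services    # shows the Fence Domain (and later the DLM/GFS lock spaces) with their member nodes
[root@gfs-node1 ~]# clustat                       # summary view of cluster membership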


IX. Check the cluster status
Node 1:
[root@gfs-node1 cluster]# cat /proc/cluster/status
Protocol version: 5.0.1
Config version: 1
Cluster name: alpha_cluster
Cluster ID: 50356
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 3
Expected_votes: 3
Total_votes: 3
Quorum: 2   
Active subsystems: 1
Node name: gfs-node1
Node ID: 1
Node addresses: 192.168.10.1

Node 2:
[root@gfs-node2 cluster]# cat /proc/cluster/status
Protocol version: 5.0.1
Config version: 1
Cluster name: alpha_cluster
Cluster ID: 50356
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 3
Expected_votes: 3
Total_votes: 3
Quorum: 2   
Active subsystems: 1
Node name: gfs-node2
Node ID: 2
Node addresses: 192.168.10.2

X. Partition the shared disks on node 1
        Check the SCSI devices with dmesg | grep scsi, as follows:
    [root@gfs-node1 ~]# dmesg | grep scsi
        scsi0 : ioc0: LSI53C1030, FwRev=00000000h, Ports=1, MaxQ=128, IRQ=169
        Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
        [root@gfs-node1 ~]# pvcreate /dev/sdb
                Physical volume "/dev/sdb" successfully created
        [root@gfs-node1 ~]# pvcreate /dev/sdc
                Physical volume "/dev/sdc" successfully created
        [root@gfs-node1 ~]# pvcreate /dev/sdd
                Physical volume "/dev/sdd" successfully created
        [root@gfs-node1 ~]# pvcreate /dev/sde
                Physical volume "/dev/sde" successfully created
       
        [root@gfs-node1 ~]# system-config-lvm
                change the physical extent size to 128k
                sdb -> VG common  -> LV ohome
                sdc -> VG oradata -> LV datafiles
                sdd -> VG redo1   -> LV log1
                sde -> VG redo2   -> LV log2

[Note] You can also partition the disks with fdisk before running pvcreate.
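
For reference, roughly equivalent command-line steps for the first volume group (a sketch only; the -L size is illustrative, so check the free space reported by vgdisplay first):

        # Create the volume group on the shared disk with a 128 KB physical extent size
        vgcreate -s 128k common /dev/sdb
        # Carve out the logical volume that will hold the shared Oracle Home (size is illustrative)
        lvcreate -L 5900M -n ohome common
        # Repeat analogously for oradata/datafiles, redo1/log1 and redo2/log2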

XI. Create the GFS filesystems
        [root@gfs-node1 ~]# mkfs.gfs -J 32 -j 4 -p lock_dlm -t alpha_cluster:log1 /dev/redo1/log1
        [root@gfs-node1 ~]# mkfs.gfs -J 32 -j 4 -p lock_dlm -t alpha_cluster:log2 /dev/redo2/log2
        [root@gfs-node1 ~]# mkfs.gfs -J 32 -j 4 -p lock_dlm -t alpha_cluster:ohome /dev/common/ohome
        [root@gfs-node1 ~]# mkfs.gfs -J 32 -j 4 -p lock_dlm -t alpha_cluster:datafiles /dev/oradata/datafiles

        To verify:
        dmesg | grep scsi
        lvscan

        Edit the /etc/fstab file and append the following at the end:
        /dev/common/ohome       /dbms/ohome     gfs _netdev 0 0
        /dev/oradata/datafiles  /dbms/oradata   gfs _netdev 0 0
        /dev/redo1/log1         /dbms/log1      gfs _netdev 0 0
        /dev/redo2/log2         /dbms/log2      gfs _netdev 0 0

        The _netdev option is also useful, as it ensures the filesystems are unmounted before the cluster services shut down.
        Edit the fstab file on both nodes. The /dbms directory and its subdirectories must be created by hand.
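
        A minimal sketch of creating the mount points and mounting everything, assuming the fstab entries above are in place (run on both nodes):

        mkdir -p /dbms/ohome /dbms/oradata /dbms/log1 /dbms/log2
        mount -a -t gfs          # mount all GFS entries from /etc/fstab
        mount | grep gfs         # verify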

XII. Create the raw partitions
        The certified configuration of Oracle 10g on GFS requires that the two Clusterware files (the OCR and the voting disk) be located on shared raw partitions
        and be visible to all RAC nodes in the cluster.
        [root@gfs-node1 ~]# fdisk /dev/sdg
        Create two 256 MB partitions to be used as raw devices.

        If the other nodes were already up and running while you created these partitions, these other nodes must re-read the partition
        table from disk:
        [root@gfs-node2 ~]# blockdev --rereadpt /dev/sdg

        Make sure the rawdevices service is enabled on all RAC nodes for the run levels that will be used. This example enables
it for run levels 3 and 5:
[root@gfs-node1 ~]# chkconfig --level 35 rawdevices on
The mapping is defined in the file /etc/sysconfig/rawdevices:
# raw device bindings
# format: <rawdev> <major> <minor>
# <rawdev> <blockdev>
# example: /dev/raw/raw1 /dev/sda1
# /dev/raw/raw2 8 5
/dev/raw/raw1 /dev/sdg1
/dev/raw/raw2 /dev/sdg2
These raw device files must always be owned by the oracle user that installs the software. A 10-second
delay is needed to ensure that the rawdevices service has had a chance to populate the /dev/raw directory. Add
these lines to /etc/rc.local (which is symbolically linked to /etc/rc?.d/S99local):
echo "Sleep a bit first and then set the permissions on raw"
sleep 10
chown oracle:dba /dev/raw/raw1
chown oracle:dba /dev/raw/raw2
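
After adding the bindings you can restart the service and list them to make sure they took effect; for example:

service rawdevices restart    # (re)apply the bindings from /etc/sysconfig/rawdevices
raw -qa                       # query all currently bound raw devices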

XIII. Edit /etc/sysctl.conf

kernel.shmmax = 4047483648
kernel.shmmni = 4096
kernel.shmall = 2097152
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 1024 65000
fs.file-max = 65536
#
# This is for the Oracle RAC core GCS services
#
net.core.rmem_default = 1048576
net.core.rmem_max = 1048576
net.core.wmem_default = 1048576
net.core.wmem_max = 1048576
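
Apply the new settings to the running kernel without rebooting:

sysctl -p    # reload /etc/sysctl.conf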

XIV. Create the oracle user
[root@gfs-node1 ~]# groupadd oinstall
[root@gfs-node1 ~]# groupadd dba
[root@gfs-node1 ~]# useradd oracle -g oinstall -G dba

Configure the /etc/sudoers file so that the oracle admin users can safely execute root commands:
# User alias specification
User_Alias SYSAD=oracle, oinstall
User_Alias USERADM=oracle, oinstall
# User privilege specification
SYSAD ALL=(ALL) ALL
USERADM ALL=(root) NOPASSWD:/usr/local/etc/yanis.client
root ALL=(ALL) ALL

Do all of the above on every node.
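
Optionally, you may also want to give the oracle user a password and a basic environment at this point. A sketch only: the ORACLE_BASE/ORACLE_HOME paths below are assumptions placed under the shared /dbms/ohome filesystem, so adjust them to your own layout:

passwd oracle                                   # run on every node
mkdir -p /dbms/ohome/oracle                     # assumed shared ORACLE_BASE on GFS
chown -R oracle:oinstall /dbms/ohome/oracle
# Example environment for ~oracle/.bash_profile (paths are assumptions)
echo 'export ORACLE_BASE=/dbms/ohome/oracle'               >> /home/oracle/.bash_profile
echo 'export ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1' >> /home/oracle/.bash_profile
echo 'export PATH=$ORACLE_HOME/bin:$PATH'                  >> /home/oracle/.bash_profile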

XV. Create a clean ssh connection environment for the oracle user
        1. On each node run ssh-keygen -t dsa and press Enter at every prompt until it finishes.
        2. On node1, collect all the ~/.ssh/id_dsa.pub files into a single ~/.ssh/authorized_keys file and distribute it to the other node:
                ssh gfs-node1 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
                ssh gfs-node2 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
                scp ~/.ssh/authorized_keys gfs-node2:~/.ssh

        3. Run each of the following commands once (answering yes to accept the host keys):
                 [oracle@gfs-node1 ~]$ ssh gfs-node1 date
                 [oracle@gfs-node1 ~]$ ssh node1-prv date
                 [oracle@gfs-node1 ~]$ ssh gfs-node2 date
                 [oracle@gfs-node1 ~]$ ssh node2-prv date

                Do the same on node2.


After that comes the Oracle 10g installation itself; be sure to choose the cluster installation option.
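
Before launching the installer it is worth running the cluster verification utility shipped on the 10gR2 Clusterware media against both nodes; a sketch, assuming the media has been unpacked to /tmp/clusterware (the path is an assumption):

[oracle@gfs-node1 ~]$ /tmp/clusterware/runcluvfy.sh stage -pre crsinst -n gfs-node1,gfs-node2 -verbose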

#5 | Posted on 2006-12-26 13:03

Reply to fuumax's post #2

Is it true that a GFS cluster requires at least two servers to be up? Then if one machine fails in production, the other one stops working too? What would be the point of a two-node setup?

#6 | Posted on 2006-12-27 08:58
Failover should be handled automatically through the heartbeat.

#7 | Posted on 2007-01-05 23:33
RAC does not have a failover problem as such!
RAC is designed for high availability.

#8 | Posted on 2007-01-29 00:09
I'm about to try this for real; the OP's experience is very useful to me. Thanks!

#9 | Posted on 2007-04-11 21:09
What is this fencing for? Does it have to be configured in cluster.conf?
