Posted on 2011-12-10 22:08

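[MySQL Cluster Architecture] DRBD + Heartbeat + MySQL master/slave high availability

DRBD + Heartbeat + MySQL high availability

On Red Hat, build from source or install from RPMs; on CentOS and similar systems you can install directly with yum.

Installing by building RPMs:

Prepare two machines to act as master and slave, and first set the hostname on each.

IPs: 192.168.1.252
192.168.1.249

Kernel on both machines: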

[root@linux252 drdb]# uname -a

Linux linux252 2.6.9-67.ELsmp #1 SMP Wed Nov 7 13:58:04 EST 2007 i686 i686 i386 GNU/Linux

[root@linux249 ~]# uname -a

Linux linux249 2.6.9-67.ELsmp #1 SMP Wed Nov 7 13:58:04 EST 2007 i686 i686 i386 GNU/Linux

[root@linux252 ~]# hostname

linux252

[root@linux249 ~]# hostname

linux249

[root@linux252 ~]# cat /etc/hosts

# Do not remove the following line, or various programs

# that require network functionality will fail.

127.0.0.1 localhost.localdomain localhost

192.168.1.252 linux252

192.168.1.249 linux249

[root@linux249 ~]# cat /etc/hosts

# Do not remove the following line, or various programs

# that require network functionality will fail.

127.0.0.1 localhost.localdomain localhost

192.168.1.252 linux252

192.168.1.249 linux249
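Both nodes need to resolve each other's names the same way, so it can help to check the hosts file mechanically. The following is a minimal sketch, not part of the original setup: the function names hosts_ip and check_hosts are made up for illustration, and the names and addresses are the ones used in this post.

```shell
#!/bin/sh
# hosts_ip FILE NAME: print the first IP address that FILE maps NAME to.
hosts_ip() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }          # skip comment lines
        {
            for (i = 2; i <= NF; i++)      # fields after the IP are host names
                if ($i == name) { print $1; exit }
        }' "$1"
}

# check_hosts FILE: succeed only if both node names map to the expected IPs.
check_hosts() {
    [ "$(hosts_ip "$1" linux252)" = "192.168.1.252" ] &&
    [ "$(hosts_ip "$1" linux249)" = "192.168.1.249" ]
}
```

Run `check_hosts /etc/hosts && echo ok` on each machine; a non-zero exit status means one of the two entries is missing or wrong.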

Packages to download: http://oss.linbit.com/drbd/8.3/drbd-8.3.1.tar.gz

ftp://ftp.eenet.ee/pub/gentoo/distfiles/libnet-1.1.2.1.tar.gz

http://www.ultramonkey.org/downl ... rtbeat-2.1.3.tar.gz

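Install DRBD:

You can compile and install it directly; here I chose to build RPM packages from the source tarball and install those. The commands below use drbd-8.3.4, so adjust the file names to the version you actually downloaded.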

tar zxvf drbd-8.3.4.tar.gz

cd drbd-8.3.4

cp drbd.spec.in drbd.spec

make rpm KDIR=/usr/src/kernels/2.6.9-67.EL-smp-i686/

cd dist/RPMS/i386/

rpm -ivh drbd-8.3.4-3.i386.rpm

rpm -ivh drbd-km-2.6.9_67.ELsmp-8.3.4-3.i386.rpm

Install libnet first (run in the unpacked libnet source directory):

./configure

make; make install

Then install DRBD (when compiling from source):

make all
make install
make install-tools

Check that the module is loaded into the kernel:

modprobe drbd

lsmod | grep drbd    (any output means the module loaded successfully)
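The same check can be scripted so that later steps stop early when the module is missing. This is a small sketch under the assumption that the kernel exposes the module list in /proc/modules with the module name in the first column; module_loaded is a hypothetical helper name.

```shell
#!/bin/sh
# module_loaded NAME [FILE]: succeed if NAME appears in the first column of
# FILE (defaults to /proc/modules, the list that lsmod formats).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}
```

For example: `module_loaded drbd || { echo "drbd module not loaded" >&2; exit 1; }`.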

Create the metadata block that DRBD uses to record its state. Run this on both hosts:

# drbdadm create-md r0

--== Thank you for participating in the global usage survey ==--

The server's response is:

Writing meta data...

initializing activity log

NOT initialized bitmap

New drbd meta data block successfully created.

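Note: the output above means the metadata was created successfully.

Notes:

1) "r0" is the resource name defined in drbd.conf.

2) Running "drbdadm create-md r0" may instead fail with the following error: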

Device size would be truncated, which

would corrupt data and result in

'access beyond end of device' errors.

You need to either

* use external meta data (recommended)

* shrink that filesystem first

* zero out the device (destroy the filesystem)

Operation refused.

Command 'drbdmeta 0 v08 /dev/xvdb internal create-md' terminated with exit code 40

drbdadm create-md r0: exited with code 40

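Fix: wipe the filesystem signature at the start of the backing device (this destroys the filesystem on it): dd if=/dev/zero bs=1M count=1 of=/dev/sdXYZ; sync

Tip: if you get an exit code 50 error, use dd to wipe the start of the disk in the same way (again destroying its contents):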

dd if=/dev/zero of=/dev/sda2 bs=1M count=1

Start DRBD on both machines at the same time:

# /etc/init.d/drbd start

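Note: right after starting DRBD on the master, start it on the backup as well; otherwise the start on the master will not complete, because it waits for its peer.

Check the DRBD status: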

# cat /proc/drbd

version: 8.3.8 (api:88/proto:86-94)

GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:27

0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----

ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:4008024

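/proc/drbd shows the current DRBD state. The ro field on the first status line (labelled st by older DRBD versions) gives the roles of the two hosts; both are Secondary (standby).

ds is the disk state; both disks are Inconsistent.

This is because DRBD cannot tell which side should be the primary, that is, whose disk holds the authoritative data. So we have to designate one host as primary to start the initial synchronization.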

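Field   Description and possible values

cs   Connection state. Possible values:

o Unconfigured: the device is waiting to be configured.

o Unconnected: transitional state while the connection is being set up.

o WFConnection: the device is waiting for the peer to be configured.

o WFReportParams: transitional state while waiting for the first packet on a new TCP connection.

o SyncingAll: all blocks of the primary node are being copied to the secondary node.

o SyncingQuick: the secondary node is being updated by copying only the blocks that changed while it was out of the cluster.

o Connected: everything is fine.

o Timeout: transitional state.

st   State (the role of the device; shown as ro in the /proc/drbd output in this post). Possible values:

o Primary (local/remote)

o Secondary

o Unknown (this is not a role)

ns   Number of blocks sent over the network

nr   Number of blocks received over the network

dw   Number of blocks written to disk

dr   Number of blocks read from disk

of   Number of blocks in flight (obsolete)

pe   Number of pending blocks

ua   Number of unacknowledged blocks (ideally 0)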
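When scripting around DRBD it is handy to pull a single field out of the status line. The sketch below assumes the /proc/drbd layout shown in this post (a line starting with "0:" followed by key:value pairs); drbd_field is a hypothetical helper name, and it only looks at the cs/ro/ds line of minor 0, not at the counter line underneath.

```shell
#!/bin/sh
# drbd_field FILE KEY: print the value of KEY (cs, ro or ds) from the
# status line of DRBD minor 0 in FILE (normally /proc/drbd).
drbd_field() {
    awk -v key="$2" '
        $1 == "0:" {
            for (i = 2; i <= NF; i++) {
                n = index($i, ":")               # split "key:value" at the first colon
                if (n > 0 && substr($i, 1, n - 1) == key) {
                    print substr($i, n + 1)
                    exit
                }
            }
        }' "$1"
}
```

For example, `drbd_field /proc/drbd ro` prints something like Secondary/Secondary, and `drbd_field /proc/drbd ds` prints the two disk states.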

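Run the following on the primary server only:

# drbdsetup /dev/drbd0 primary -o    # make this node the primary (-o is short for --overwrite-data-of-peer)

Check the status again: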

# cat /proc/drbd

version: 8.3.8 (api:88/proto:86-94)

GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:27

0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----

ns:338940 nr:0 dw:0 dr:346240 al:0 bm:20 lo:1 pe:163 ua:229 ap:0 ep:1 wo:b oos:3674296

[>...................] sync'ed:  8.4% (3674296/4008024)K delay_probe: 21

finish: 0:01:06 speed: 55,620 (55,620) K/sec

On the standby server you can see the data synchronization in progress:

# cat /proc/drbd

version: 8.3.8 (api:88/proto:86-94)

GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:27

0: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r----

ns:0 nr:481216 dw:476096 dr:0 al:0 bm:28 lo:161 pe:6116 ua:160 ap:0 ep:1 wo:b oos:3531928

[=>..................] sync'ed: 12.0% (3531928/4008024)K queue_delay: 0.1 ms

finish: 0:01:06 speed: 52,896 (52,896) want: 204,800 K/sec

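Explanation:

The roles are now Primary/Secondary. The primary's disk state is UpToDate and the standby's is Inconsistent.

The third line shows that synchronization is in progress, that is, the primary is sending the data on its disk to the standby. On the primary the progress line reads [>...................] sync'ed:  8.4% (3674296/4008024)K

Wait a while, and once synchronization has finished, check the DRBD status on the primary again:

Both disk states are now UpToDate, which means the data is fully synchronized.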
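Instead of re-running cat by hand, the wait can be scripted. This is a rough sketch with hypothetical function names (drbd_synced, wait_for_sync); it assumes that a finished sync shows up as ds:UpToDate/UpToDate on the status line of minor 0, as described above.

```shell
#!/bin/sh
# drbd_synced FILE: succeed when minor 0 reports ds:UpToDate/UpToDate.
drbd_synced() {
    grep -Eq '^[[:space:]]*0:.* ds:UpToDate/UpToDate( |$)' "$1"
}

# wait_for_sync FILE TRIES DELAY: poll FILE up to TRIES times, sleeping
# DELAY seconds between attempts; fail if the disks never become UpToDate.
wait_for_sync() {
    file=$1 tries=$2 delay=$3
    while [ "$tries" -gt 0 ]; do
        drbd_synced "$file" && return 0
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1
}
```

For example, `wait_for_sync /proc/drbd 120 5 && echo "sync complete"` checks every 5 seconds for up to about 10 minutes before giving up.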

On the primary server (Master DRBD), create a filesystem on the DRBD device and mount it:

# mkfs.ext3 /dev/drbd0

# mkdir /db2

# mount /dev/drbd0 /db2

[root@linux249 ~]# df -ha

Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       49G  3.3G   43G   8% /
none                     0     0     0   -  /proc
none                     0     0     0   -  /sys
none                     0     0     0   -  /dev/pts
usbfs                    0     0     0   -  /proc/bus/usb
/dev/hda1              99M   13M   81M  14% /boot
none                  252M     0  252M   0% /dev/shm
none                     0     0     0   -  /proc/sys/fs/binfmt_misc
sunrpc                   0     0     0   -  /var/lib/nfs/rpc_pipefs
/dev/drbd0             20G  193M   19G   2% /db2

Install Heartbeat

Compile and install:

groupadd haclient
useradd -g haclient hacluster    # create the group and user first

./ConfigureMe configure --enable-fatal-warnings=no
./ConfigureMe make --enable-fatal-warnings=no

make; make install

# cp /usr/share/doc/heartbeat-2.1.4/haresources /etc/ha.d

# cp /usr/share/doc/heartbeat-2.1.4/ha.cf /etc/ha.d

# cp /usr/share/doc/heartbeat-2.1.4/authkeys /etc/ha.d

# chmod 600 /etc/ha.d/authkeys

[root@linux252 drdb]# vi /etc/ha.d/authkeys

auth 1

1 crc
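The crc method above only detects corrupted packets and involves no secret; Heartbeat's authkeys format also accepts md5 and sha1 with a shared key, which is the safer choice on a network you do not fully control. The following sketch generates such a file; write_authkeys is a hypothetical helper, and using 32 random bytes from /dev/urandom for the key is just one reasonable choice.

```shell
#!/bin/sh
# write_authkeys PATH: create an authkeys file that uses sha1 with a
# random 64-hex-digit key, readable by its owner only.
write_authkeys() {
    key=$(dd if=/dev/urandom bs=32 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
    ( umask 077 && printf 'auth 1\n1 sha1 %s\n' "$key" > "$1" ) &&
    chmod 600 "$1"
}
```

Generate the file once, for example with `write_authkeys /etc/ha.d/authkeys`, and copy the same file to the other node, since both sides must share the key.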

[root@linux252 drdb]# cat /etc/ha.d/ha.cf

debugfile /var/log/ha-debug

logfile /var/log/ha-log

logfacility local0

keepalive 2

deadtime 6

warntime 4

initdead 12

auto_failback on

node linux252

node linux249

udpport 694

ucast eth0 192.168.1.249

ping_group group1 192.168.1.252 192.168.1.249

respawn hacluster /usr/lib/heartbeat/ipfail

apiauth ipfail gid=haclient uid=hacluster

hopfudge
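The timing directives in ha.cf depend on each other, so a quick sanity check can catch typos. The sketch below is an assumption-laden illustration: it expects plain integer seconds (as in the listing above), uses made-up helper names, and encodes the commonly cited guidance that keepalive < warntime < deadtime and that initdead should be at least twice deadtime.

```shell
#!/bin/sh
# hacf_value FILE KEY: print the value of the first "KEY value" directive.
hacf_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# check_hacf_timing FILE: succeed if keepalive < warntime < deadtime
# and initdead >= 2 * deadtime (all values in whole seconds).
check_hacf_timing() {
    keepalive=$(hacf_value "$1" keepalive)
    warntime=$(hacf_value "$1" warntime)
    deadtime=$(hacf_value "$1" deadtime)
    initdead=$(hacf_value "$1" initdead)
    [ "$keepalive" -lt "$warntime" ] &&
    [ "$warntime" -lt "$deadtime" ] &&
    [ "$initdead" -ge $((deadtime * 2)) ]
}
```

With the values in this post (keepalive 2, warntime 4, deadtime 6, initdead 12), `check_hacf_timing /etc/ha.d/ha.cf` succeeds.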

[root@linux252 drdb]# cat /etc/ha.d/haresources

linux252 drbddisk::r0 Filesystem::/dev/drbd0::/db2::ext3 192.168.1.248 mysqld

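linux252: the name of the current primary node (as reported by uname -n).

drbddisk: tells Heartbeat to manage the DRBD resource (here r0).

Filesystem: tells Heartbeat to manage a filesystem resource, which in practice means running mount/umount. The fields after each "::" are the arguments passed to Filesystem: the device name, the mount point and the filesystem type.

/dev/drbd0::/db2: once Heartbeat is started, it mounts /dev/drbd0 on /db2 automatically.

mysqld: the script that starts MySQL. Copy the MySQL init script to /etc/ha.d/resource.d/mysqld so that Heartbeat can run it and start the MySQL service.

Heartbeat performs the mount and starts the MySQL service automatically; there is no need to bring up the DRBD resource or MySQL by hand.

linux252 drbddisk::r0 Filesystem::/dev/drbd0::/db2::ext3 192.168.1.248 mysqld    (the IP here is the VIP, which floats between master and slave depending on which one is down)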
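To make the structure of that line easier to see, the sketch below splits it into the preferred node and the individual resources, with the "::"-separated arguments listed after each script name. explain_haresources is a hypothetical helper written only for illustration; it does not talk to Heartbeat.

```shell
#!/bin/sh
# explain_haresources LINE: print the preferred node, then one resource
# per line with its "::"-separated arguments.
explain_haresources() {
    printf '%s\n' "$1" | awk '{
        print "node: " $1
        for (i = 2; i <= NF; i++) {
            n = split($i, part, "::")
            line = "resource: " part[1]
            for (j = 2; j <= n; j++) line = line " arg=" part[j]
            print line
        }
    }'
}
```

Heartbeat starts the resources from left to right and stops them in the reverse order, so the listing also shows the start order: promote DRBD, mount the filesystem, bring up the VIP (a bare IP address is shorthand for the IPaddr resource), then start mysqld.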

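Start Heartbeat on both machines, then run ip addr on each to see which one holds the floating VIP; that machine is the master. If the master goes down, the slave takes over the VIP and does the master's work.

You can test the takeover by shutting down the master.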
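During a failover test it is convenient to have a one-line check for where the VIP currently lives. A minimal sketch, assuming the `ip addr` output format "inet ADDRESS/PREFIX ..." and a hypothetical helper name has_vip:

```shell
#!/bin/sh
# has_vip VIP: read `ip addr` output on stdin and succeed if VIP is
# configured on some interface.
has_vip() {
    grep -Fq "inet $1/"
}
```

For example, `ip addr show | has_vip 192.168.1.248 && echo "this node holds the VIP"`. Run it on both nodes before and after stopping the master to watch the address move.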


Other related information:

Configuration files:

DRBD configuration

Master: 192.168.1.252

[root@linux252 drdb]# cat /etc/drbd.conf

global {

usage-count yes;

}

common {

syncer { rate 10M; }

}

resource r0 {

protocol C;

handlers {

pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";

pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";

local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";

fence-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";

}

disk {

on-io-error detach;

}

startup {

#allow-two-primaries;

become-primary-on both;

}

net {

allow-two-primaries;

after-sb-0pri disconnect;

after-sb-1pri disconnect;

after-sb-2pri disconnect;

rr-conflict disconnect;

}

syncer {

rate 10M;

al-extents 257;

}

on linux252 {

device /dev/drbd0;

disk /dev/hdb;

address 192.168.1.252:7788;

flexible-meta-disk internal;

}

on linux249 {

device /dev/drbd0;

disk /dev/hdb;

address 192.168.1.249:7788;

meta-disk internal;

}

}

[root@linux252 drdb]#

Heartbeat configuration:

[root@linux252 drdb]# cat /etc/ha.d/ha.cf

debugfile /var/log/ha-debug

logfile /var/log/ha-log

logfacility local0

keepalive 2

deadtime 6

warntime 4

initdead 12

auto_failback on

node linux252

node linux249

udpport 694

ucast eth0 192.168.1.249

ping_group group1 192.168.1.252 192.168.1.249

respawn hacluster /usr/lib/heartbeat/ipfail

apiauth ipfail gid=haclient uid=hacluster

hopfudge

[root@linux252 drdb]# cat /etc/ha.d/authkeys

auth 1

1 crc

[root@linux252 drdb]# cat /etc/ha.d/haresources

linux252 drbddisk::r0 Filesystem::/dev/drbd0::/db2::ext3 192.168.1.248 mysqld

Slave: 192.168.1.249

Configuration file on the slave:

[root@linux249 ~]# cat /etc/drbd.conf

global {

usage-count yes;

}

common {

syncer { rate 10M; }

}

resource r0 {

protocol C;

handlers {

pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";

pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";

local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";

fence-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";

}

disk {

on-io-error detach;

}

startup {

#allow-two-primaries;

become-primary-on both;

}

net {

allow-two-primaries;

after-sb-0pri disconnect;

after-sb-1pri disconnect;

after-sb-2pri disconnect;

rr-conflict disconnect;

}

syncer {

rate 10M;

al-extents 257;

}

on linux252 {

device /dev/drbd0;

disk /dev/hdb;

address 192.168.1.252:7788;

flexible-meta-disk internal;

}

on linux249 {

device /dev/drbd0;

disk /dev/hdb;

address 192.168.1.249:7788;

meta-disk internal;

}

}
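As the two listings show, drbd.conf is meant to be the same on both nodes. A quick way to confirm this is to compare the files after stripping comments and whitespace differences. The sketch below uses hypothetical helper names and a deliberately simple normalization: it treats everything after a "#" as a comment, which is fine for the file in this post but would be wrong for a "#" inside a quoted string.

```shell
#!/bin/sh
# normalized FILE: print FILE without comments and blank lines, with
# runs of whitespace collapsed to a single space.
normalized() {
    sed -e 's/#.*$//' \
        -e 's/[[:space:]][[:space:]]*/ /g' \
        -e 's/^ //' -e 's/ $//' "$1" | grep -v '^$'
}

# same_config A B: succeed if the two files are equivalent after normalization.
same_config() {
    a=$(normalized "$1") b=$(normalized "$2")
    [ "$a" = "$b" ]
}
```

For example, copy the peer's file over with `scp linux249:/etc/drbd.conf /tmp/drbd.conf.peer` and then run `same_config /etc/drbd.conf /tmp/drbd.conf.peer || echo "configs differ"`.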
