I have recently been working on the same thing. My project is xen+pacemaker+drbd for two-node failover. Automatic failover of a PV DomU between the two nodes was easy to achieve, but failover of an HVM DomU is still not working. For the HVM DomU I took a roundabout route, combining clvm+gfs2 to keep the data identical while both nodes are primary. In my view, since the data on both sides can be kept fully consistent, HVM DomU failover should not be a problem, but unfortunately switching via the crm console fails: migration starts on the original node, quickly fails, and the DomU service is restarted on the original node. I would welcome any insight. A second issue: on a single machine, dd if=/dev/zero of=/root/bigfile bs=512M count=2 oflag=direct reaches 120 MB/s on the raw disk, but drops to 40 MB/s on the DRBD device (gigabit network). That drop is surprisingly large, and even with gfs2 there is no improvement. Any explanation? My notes follow; comments and corrections are welcome.
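Before the notes, for reference, here is a sketch of the DRBD 8.3 options that are commonly tuned for write throughput. The values are untested assumptions, not measured results, and the barrier/flush options are only safe with a battery-backed or non-volatile write cache:
resource mail_var {
    # ... the on vbox1 / on vbox2 sections stay as configured below ...
    net {
        allow-two-primaries;
        sndbuf-size    512k;    # larger TCP send buffer for the GbE link
        max-buffers    8000;    # more in-flight buffers
        max-epoch-size 8000;
    }
    syncer {
        verify-alg md5;
        al-extents 3389;        # bigger activity log, fewer metadata updates
    }
    disk {
        no-disk-barrier;        # ONLY with battery-backed cache
        no-disk-flushes;        # ONLY with battery-backed cache
    }
}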
Xen, DRBD, OpenAIS, Pacemaker, Corosync on Debian 6.0 for Two-Node Cluster Fail-over
By Wang Xiantong, xiantong at gmail dot com
Date: 2012.4.18
Dom0
First install the hypervisor, the Xen kernel, xen-tools, and xen-docs:
# apt-get install xen-hypervisor-4.0-amd64 linux-image-xen-amd64 -y
# apt-get install xen-docs-4.0
Caution: on a 32-bit system, install the corresponding 32-bit packages instead
# apt-get install xen-hypervisor-4.0-i386 linux-image-xen-686
Later we will run a Windows DomU, so HVM support is required:
# apt-get install xen-qemu-dm-4.0 -y
Debian 6.0 (squeeze) uses grub2; make the system boot the newly installed Xen kernel by default:
# mv -i /etc/grub.d/10_linux /etc/grub.d/50_linux
# update-grub2
Prevent Dom0 from probing other volumes for boot records. This keeps the grub menu from picking up DomU kernels that live on logical volumes; note that if the host multi-boots (e.g. with Windows), those entries are suppressed as well.
# echo "" >> /etc/default/grub
# echo "# Disable OS prober to prevent virtual machines on logical volumes from appearing in the boot menu." >> /etc/default/grub
# echo "GRUB_DISABLE_OS_PROBER=true" >> /etc/default/grub
# update-grub2
By default, when dom0 shuts down or reboots it tries to save the state of running domUs. This feature can occasionally cause problems, so disable it by editing /etc/default/xendomains:
XENDOMAINS_RESTORE=false
XENDOMAINS_SAVE=""
# sed -i.bak "s/^XENDOMAINS_RESTORE.*/XENDOMAINS_RESTORE=false/g" /etc/default/xendomains
# sed -i.bak "s/^XENDOMAINS_SAVE.*/XENDOMAINS_SAVE=""/g" /etc/default/xendomains
# scp /etc/default/xendomains 192.168.1.16:/etc/default/
Edit /etc/xen/xend-config.sxp to enable the network bridge:
(network-script 'network-bridge antispoof=yes')
antispoof=yes enables the dom0 firewall so that a domU cannot use an IP address that dom0 has not allowed it, e.g. a domU grabbing the gateway's IP, which could disrupt the whole network. This feature also requires specifying the domU's IP in its configuration file, for example:
vif = ['vifname=vbox1,ip=192.168.1.50']
This restricts the domU to the IP 192.168.1.50; traffic from any other IP is blocked by dom0's firewall.
With the plain (network-script network-bridge) line, antispoof follows the script's built-in default; to disable it explicitly, change the line to
(network-script 'network-bridge antispoof=no')
Enable migration by changing the following parameters in /etc/xen/xend-config.sxp:
(xend-relocation-server yes)
(xend-relocation-port 8002)
(xend-relocation-address '')
(xend-relocation-hosts-allow '192.168.1.15 192.168.1.16')
# sed -i.bak "s/.*(network-script network-bridge)$/(network-script network-bridge)/g" /etc/xen/xend-config.sxp
# sed -i.bak "s/^#(xend-relocation-server no)$/(xend-relocation-server yes)/g" /etc/xen/xend-config.sxp
# sed -i.bak "s/^#(xend-relocation-port 8002)$/(xend-relocation-port 8002)/g" /etc/xen/xend-config.sxp
# sed -i.bak "s/^#(xend-relocation-address '')$/(xend-relocation-address '')/g" /etc/xen/xend-config.sxp
# sed -i.bak "s/^#(xend-relocation-hosts-allow '')$/(xend-relocation-hosts-allow '192.168.1.15 192.168.1.16')/g" /etc/xen/xend-config.sxp
# scp /etc/xen/xend-config.sxp 192.168.1.16:/etc/xen/
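After the changes, restart xend and check that the relocation server is listening on the configured port (a quick sanity check; xend is restarted by the reboot below anyway):
# /etc/init.d/xend restart
# netstat -tlnp | grep 8002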
Edit /etc/hosts and add:
192.168.1.15 vbox1
192.168.1.16 vbox2
# cat >/etc/hosts<<-EOF
127.0.0.1 localhost
192.168.1.15 vbox1
192.168.1.16 vbox2
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
EOF
# scp /etc/hosts 192.168.1.16:/etc/
# ssh vbox2 -- reboot
# reboot && exit
The Dom0 installation and parameter changes above are applied to both vbox1 (192.168.1.15) and vbox2 (192.168.1.16).
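After the reboot it is worth confirming that both hosts actually booted the Xen kernel (the kernel version shown here is just an example):
# uname -r         (should report a -xen kernel, e.g. 2.6.32-5-xen-amd64)
# xm list          (Domain-0 should be the only domain at this point)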
DRBD
# apt-get install drbd8-utils
# ssh vbox2 -- apt-get install drbd8-utils
# lvcreate -L8G -n mail_root vg1
# lvcreate -L100G -n mail_var vg1
# lvcreate -L2G -n mail_swap vg1
# ssh vbox2 -- lvcreate -L8G -n mail_root vg1
# ssh vbox2 -- lvcreate -L100G -n mail_var vg1
# ssh vbox2 -- lvcreate -L2G -n mail_swap vg1
# cat >/etc/drbd.d/mail.res<<-EOF
resource mail_root {
    on vbox1 {
        device    /dev/drbd0;
        disk      /dev/vg1/mail_root;
        address   192.168.1.15:7790;
        meta-disk internal;
    }
    on vbox2 {
        device    /dev/drbd0;
        disk      /dev/vg1/mail_root;
        address   192.168.1.16:7790;
        meta-disk internal;
    }
    net {
        allow-two-primaries;
    }
    syncer {
        verify-alg md5;
    }
}
resource mail_var {
    on vbox1 {
        device    /dev/drbd1;
        disk      /dev/vg1/mail_var;
        address   192.168.1.15:7791;
        meta-disk internal;
    }
    on vbox2 {
        device    /dev/drbd1;
        disk      /dev/vg1/mail_var;
        address   192.168.1.16:7791;
        meta-disk internal;
    }
    net {
        allow-two-primaries;
    }
    syncer {
        verify-alg md5;
    }
}
resource mail_swap {
    on vbox1 {
        device    /dev/drbd2;
        disk      /dev/vg1/mail_swap;
        address   192.168.1.15:7792;
        meta-disk internal;
    }
    on vbox2 {
        device    /dev/drbd2;
        disk      /dev/vg1/mail_swap;
        address   192.168.1.16:7792;
        meta-disk internal;
    }
    net {
        allow-two-primaries;
    }
    syncer {
        verify-alg md5;
    }
}
EOF
Sync the file to vbox2:
# scp /etc/drbd.d/mail.res vbox2:/etc/drbd.d/
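Before bringing the resources up, the configuration can be sanity-checked on both nodes; drbdadm dump parses the config files and prints what it understood:
# drbdadm dump all
# ssh vbox2 -- drbdadm dump all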
# /etc/init.d/drbd start (run on both nodes, vbox1 and vbox2)
Create device metadata. This step must be completed only on initial device creation. It initializes DRBD's metadata
# drbdadm create-md mail_root
# drbdadm create-md mail_var
# drbdadm create-md mail_swap
# ssh vbox2 -- drbdadm create-md mail_root <<<yes
# ssh vbox2 -- drbdadm create-md mail_var <<<yes
# ssh vbox2 -- drbdadm create-md mail_swap <<<yes
Attach to backing device. This step associates the DRBD resource with its backing device:
# drbdadm attach mail_root
# drbdadm attach mail_var
# drbdadm attach mail_swap
Set synchronization parameters. This step sets synchronization parameters for the DRBD resource
# drbdadm syncer mail_root
# drbdadm syncer mail_var
# drbdadm syncer mail_swap
Connect to peer. This step connects the DRBD resource with its counterpart on the peer node
# drbdadm connect mail_root
# drbdadm connect mail_var
# drbdadm connect mail_swap
# ssh vbox2 -- drbdadm up mail_root
# ssh vbox2 -- drbdadm up mail_var
# ssh vbox2 -- drbdadm up mail_swap
TIP: You may collapse the steps drbdadm attach, drbdadm syncer, and drbdadm connect into one, by using the shorthand command drbdadm up
# cat /proc/drbd
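At this stage the output should look roughly like the following for each device: connected, both sides Secondary, and Inconsistent because no initial sync has run yet (version line and counters will differ):
version: 8.3.7 (api:88/proto:86-91)
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8388316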
Select an initial sync source. If you are dealing with newly-initialized, empty disk, this choice is entirely arbitrary. If one of your nodes already has valuable data that you need to preserve, however, it is of crucial importance that you select that node as your synchronization source. If you do initial device synchronization in the wrong direction, you will lose that data. Exercise caution.
Start the initial full synchronization. This step must be performed on only one node, only on initial resource configuration, and only on the node you selected as the synchronization source. To perform this step, issue this command:
# drbdadm -- --overwrite-data-of-peer primary mail_root
# drbdadm -- --overwrite-data-of-peer primary mail_var
# drbdadm -- --overwrite-data-of-peer primary mail_swap
# drbdsetup /dev/drbd0 syncer -r 110M
# drbdsetup /dev/drbd1 syncer -r 110M
# drbdsetup /dev/drbd2 syncer -r 110M
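The 110M rate above is a temporary override for the initial sync; once it has finished, drbdadm adjust re-applies the values from the configuration files:
# drbdadm adjust all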
# cat /proc/drbd
Unless noted otherwise, the commands and configuration above must be executed on both vbox1 and vbox2.
# mkfs.ext3 /dev/drbd0
# mkfs.ext3 /dev/drbd1
# mkswap /dev/drbd2
PV DomU
One of Xen's big selling points is paravirtualization (PV), which both improves performance and makes virtualization possible on hardware without virtualization extensions. There are many ways to install a DomU; demonstrated here is the easiest and most reliable one.
# wget http://mirrors.163.com/debian/di ... netboot/xen/vmlinuz
# wget http://mirrors.163.com/debian/di ... tboot/xen/initrd.gz
# cat /etc/xen/scripts/hotplugpath.sh
#!/bin/bash
#
# CAUTION: this script is manually created
# see: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=591456
# it should go away with xen-common 4.1.0~rc6-1
#
SBINDIR="/usr/sbin"
BINDIR="/usr/bin"
#LIBEXEC="/usr/lib/xen/bin"
LIBEXEC="/usr/lib/xen-4.0/bin"
LIBDIR="/usr/lib64"
#LIBDIR="/usr/lib"
SHAREDIR="/usr/share"
PRIVATE_BINDIR="/usr/lib64/xen-4.0/bin"
#PRIVATE_BINDIR="/usr/lib/xen-4.0/bin"
#XENFIRMWAREDIR="/usr/lib/xen/boot"
XENFIRMWAREDIR="/usr/lib/xen-4.0/boot"
XEN_CONFIG_DIR="/etc/xen"
XEN_SCRIPT_DIR="/etc/xen/scripts"
# scp /etc/xen/scripts/hotplugpath.sh vbox2:/etc/xen/scripts/
Caution: the hotplugpath.sh file is not shipped with Debian 6 and must be created by hand; without it, Xen cannot handle DomU disks specified in the drbd:mail_root format.
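The drbd: disk prefix relies on the block-drbd helper that ships with drbd8-utils; it is worth confirming it is present on both nodes (path assumed):
# ls -l /etc/xen/scripts/block-drbd
# ssh vbox2 -- ls -l /etc/xen/scripts/block-drbd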
Create the configuration file /etc/xen/mail for the DomU named mail:
# cat /etc/xen/mail
name = "mail"
memory = "1024"
disk = ['drbd:mail_root,xvda,w','drbd:mail_var,xvdb,w','drbd:mail_swap,xvdc,w']
#disk = ['phy:/dev/drbd0,xvda,w','phy:/dev/drbd1,xvdb,w','phy:/dev/drbd2,xvdc,w']
kernel = "/tmp/vmlinuz"
ramdisk = "/tmp/initrd.gz"
vif = ['vifname=mail,ip=192.168.1.21']
on_reboot = 'restart'
on_crash = 'destroy'
# xm create mail -c
Install it just like a physical machine. When partitioning, use the disks as presented and do not change the partition layout (formatting them is fine); see the sketch below for how the disks map. After installation the domU reboots automatically. Also, the IP address inside the domU must be set to 192.168.1.21 as specified in the domU config file, otherwise it will have no network access.
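A rough sketch of the intended mapping inside the DomU, written as fstab lines; the exact device names and filesystem choices are assumptions and may differ in your installer:
/dev/xvda   /      ext3   defaults   0 1
/dev/xvdb   /var   ext3   defaults   0 2
/dev/xvdc   none   swap   sw         0 0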
# xm destroy mail
Then edit /etc/xen/mail:
name = "mail"
memory = "1024"
disk = ['drbd:mail_root,xvda,w','drbd:mail_var,xvdb,w','drbd:mail_swap,xvdc,w']
#disk = ['phy:/dev/drbd0,xvda,w','phy:/dev/drbd1,xvdb,w','phy:/dev/drbd2,xvdc,w']
bootloader = "/usr/bin/pygrub"
vif = ['vifname=mail,ip=192.168.1.21']
on_reboot = 'restart'
on_crash = 'destroy'
# xm create mail -c
# scp /etc/xen/mail vbox2:/etc/xen/
# xm migrate --live mail vbox2
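The live migration can be verified from vbox1; the mail domain should now be listed on vbox2:
# ssh vbox2 -- xm list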
Corosync and Pacemaker
# echo "deb http://backports.debian.org/debian-backports squeeze-backports main">>/etc/apt/sources.list
# scp /etc/apt/sources.list vbox2:/etc/apt/
# apt-get update
# ssh vbox2 -- apt-get update
# apt-get -t squeeze-backports install pacemaker corosync dlm-pcmk gfs-pcmk gfs2-tools clvm -y
# ssh vbox2 -- apt-get -t squeeze-backports install pacemaker corosync dlm-pcmk gfs-pcmk gfs2-tools clvm -y
Edit /etc/corosync/corosync.conf, adding or modifying the following:
compatibility: whitetank
interface {
    # The following values need to be set based on your environment
    ringnumber: 0
    bindnetaddr: 192.168.1.0
    mcastaddr: 226.94.1.1
    mcastport: 5405
}
service {
    # Load the Pacemaker Cluster Resource Manager
    ver: 0
    name: pacemaker
    clustername: mycluster
}
aisexec {
    user: root
    group: root
}
Edit /etc/default/corosync:
START=yes
# for f in /etc/corosync/corosync.conf /etc/default/corosync; do scp $f vbox2:$f; done
/etc/corosync/authkey (this file must be distributed to every node; all nodes share the same authkey)
When logged in remotely, corosync-keygen needs entropy to generate the authkey; generating I/O is a reasonably effective way to produce it, e.g. dd if=/dev/zero of=bigfile count=51200 bs=10240 (run it several times):
# for i in $(seq 20); do dd if=/dev/zero of=/root/bigfile bs=512M count=2 oflag=direct; done
# corosync-keygen
# chmod 400 /etc/corosync/authkey
# scp /etc/corosync/authkey vbox2:/etc/corosync/
# ssh vbox2 -- chmod 400 /etc/corosync/authkey
# /etc/init.d/corosync start (on both nodes, vbox1 and vbox2)
# crm status
============
Last updated: Fri May 25 15:22:43 2012
Last change: Fri May 25 15:07:02 2012 via crmd on vbox1
Stack: openais
Current DC: vbox1 - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, 2 expected votes
0 Resources configured.
============
Online: [ vbox1 vbox2 ]
# crm
crm(live)# configure
crm(live)configure#
crm(live)configure# property stonith-enabled=false
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# commit
crm(live)configure# primitive mail ocf:heartbeat:Xen \
params xmfile="/etc/xen/mail" \
op monitor interval="10s" \
op start interval="0s" timeout="30s" \
op stop interval="0s" timeout="300s" \
meta allow-migrate="true" target-role="Started"
crm(live)configure# commit
crm(live)configure# exit
# crm resource move mail vbox2
# crm status
============
Last updated: Fri May 25 15:36:05 2012
Last change: Fri May 25 15:35:39 2012 via crm_resource on vbox1
Stack: openais
Current DC: vbox1 - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ vbox1 vbox2 ]
mail (ocf::heartbeat:Xen): Started vbox2
HVM DomU
Edit /etc/lvm/lvm.conf:
locking_type = 3
# filtering for DRBD:
filter = [ "a|drbd.*|", "a|sda.*|", "r|.*|" ]
# scp /etc/lvm/lvm.conf vbox2:/etc/lvm/
# lvcreate -L200G -ngfs2mirror vg1
# ssh vbox2 -- lvcreate -L200G -ngfs2mirror vg1
Create /etc/drbd.d/gfs2mirror.res:
resource gfs2mirror {
    on vbox1 {
        device    /dev/drbd10;
        disk      /dev/vg1/gfs2mirror;
        address   192.168.1.15:7800;
        meta-disk internal;
    }
    on vbox2 {
        device    /dev/drbd10;
        disk      /dev/vg1/gfs2mirror;
        address   192.168.1.16:7800;
        meta-disk internal;
    }
    net {
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    syncer {
        verify-alg md5;
    }
    startup {
        become-primary-on both;
    }
}
# scp /etc/drbd.d/gfs2mirror.res vbox2:/etc/drbd.d/
# drbdadm create-md gfs2mirror
# ssh vbox2 -- drbdadm create-md gfs2mirror <<<yes
# drbdadm up gfs2mirror
# ssh vbox2 -- drbdadm up gfs2mirror
# drbdadm -- --overwrite-data-of-peer primary gfs2mirror
# drbdsetup /dev/drbd10 syncer -r 110M
# ssh vbox2 -- drbdadm primary gfs2mirror
# mkfs.gfs2 -p lock_dlm -j 2 -t pcmk:gfs2mirror /dev/drbd/by-res/gfs2mirror
Add the following resource definitions from the crm shell (crm → configure → paste → commit):
primitive p_drbd_gfs2mirror ocf:linbit:drbd \
params drbd_resource="gfs2mirror" \
op monitor interval="10" role="Master" \
op monitor interval="30" role="Slave"
ms ms_drbd_gfs2mirror p_drbd_gfs2mirror \
meta notify="true" master-max="2" interleave="true"
primitive p_controld ocf:pacemaker:controld \
params daemon="dlm_controld.pcmk" \
op start interval="0" timeout="90" \
op stop interval="0" timeout="100" \
op monitor interval="10"
primitive p_gfs2_controld ocf:pacemaker:controld \
params daemon="gfs_controld.pcmk" \
op start interval="0" timeout="90" \
op stop interval="0" timeout="100" \
op monitor interval="10"
group g_gfs2mgmt p_controld p_gfs2_controld
clone cl_gfs2mgmt g_gfs2mgmt \
meta interleave="true"
primitive p_fs_gfs2mirror ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/gfs2mirror" directory="/gfs2mirror" fstype="gfs2" options="rw,noatime"
clone cl_fs_gfs2mirror p_fs_gfs2mirror \
meta target-role="Started"
colocation c_gfs2mirror inf: cl_fs_gfs2mirror cl_gfs2mgmt ms_drbd_gfs2mirror:Master
order o_gfs2mirror inf: ms_drbd_gfs2mirror:promote cl_gfs2mgmt:start cl_fs_gfs2mirror:start
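Before restarting corosync, the new configuration can be reviewed from the shell:
# crm configure show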
# service corosync restart
# ssh vbox2 -- service corosync restart
# mkdir /gfs2mirror/{image,iso,xen}
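The mkdir above only works once the cluster has mounted the GFS2 filesystem on both nodes; this can be confirmed with:
# crm_mon -1
# mount | grep gfs2mirror
# ssh vbox2 -- mount | grep gfs2mirror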
Create /gfs2mirror/xen/ad1:
kernel='/usr/lib/xen-4.0/boot/hvmloader'
builder='hvm'
name='ad1'
device_model='/usr/lib/xen-4.0/bin/qemu-dm'
disk=['file:/gfs2mirror/image/ad1.img,hda,w','file:/gfs2mirror/iso/en_windows_server_2008_datacenter_enterprise_standard_sp2_x86_dvd_342333.iso,hdc:cdrom,r']
#disk=['file:/gfs2mirror/image/ad1.img,hda,w']
boot='dc'
#boot='c'
memory='2048'
shadow_memory='8'
vcpus=1
#vif=['type=ioemu,model=e1000,mac=00:21:41:e2:31:04,bridge=eth0']
vif=['type=ioemu,mac=00:21:41:e2:31:04,bridge=eth0']
on_poweroff='destroy'
on_reboot='restart'
on_crash='restart'
vnc=1
vnclisten='0.0.0.0'
vncdisplay=0
vncunused=1
vncpasswd='pass'
# ln -s /gfs2mirror/xen/ad1 /etc/xen/ad1
# ssh vbox2 -- ln -s /gfs2mirror/xen/ad1 /etc/xen/ad1
Create the HVM disk image file:
# qemu-img create -f raw /gfs2mirror/image/ad1.img 50G
# ls /gfs2mirror/image -l
total 4
-rw-r--r-- 1 root root 53687091200 May 25 17:10 ad1.img
# du /gfs2mirror/image
8 /gfs2mirror/image
# xm create ad1
Install the OS remotely with vncviewer; once finished, edit /gfs2mirror/xen/ad1:
kernel='/usr/lib/xen-4.0/boot/hvmloader'
builder='hvm'
name='ad1'
device_model='/usr/lib/xen-4.0/bin/qemu-dm'
#disk=['file:/gfs2mirror/image/ad1.img,hda,w','file:/gfs2mirror/iso/en_windows_server_2008_datacenter_enterprise_standard_sp2_x86_dvd_342333.iso,hdc:cdrom,r']
disk=['file:/gfs2mirror/image/ad1.img,hda,w']
#boot='dc'
boot='c'
memory='2048'
shadow_memory='8'
vcpus=1
#vif=['type=ioemu,model=e1000,mac=00:21:41:e2:31:04,bridge=eth0']
vif=['type=ioemu,mac=00:21:41:e2:31:04,bridge=eth0']
on_poweroff='destroy'
on_reboot='restart'
on_crash='restart'
vnc=1
vnclisten='0.0.0.0'
vncdisplay=0
vncunused=1
vncpasswd='pass'
Define the pacemaker resource for the HVM DomU the same way as for the PV DomU (via crm configure):
primitive ad1 ocf:heartbeat:Xen \
params xmfile="/gfs2mirror/xen/ad1" \
op monitor interval="10s" \
op start interval="0s" timeout="30s" \
op stop interval="0s" timeout="300s" \
meta allow-migrate="true" target-role="Started"
Note: HVM live migration fails, so this approach does not seem to work: the HVM DomU migration fails and the domain is restarted on the original node. However, with the HVM DomU not under pacemaker management, a live migration done with xm succeeds.