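Redis hot standby on two servers with Keepalived

Use case:
The service load is light, but the reliability requirements are extremely high.
Characteristics:
At least two servers are required. One is the master and always serves requests; the other serves requests only when the master goes down.
Advantage: data synchronization is very simple, unlike load balancing, which places very high demands on data consistency.
Disadvantage: some hardware is wasted, because one server is always idle.

Notes:
Operating system: CentOS 6.3, 64-bit
Redis servers: 172.27.4.38, 172.27.4.39

Architecture plan:
Keepalived and Redis must be installed on both the Master and the Slave, and Redis must have persistence to local disk enabled.
Master: 172.27.4.38
Slave: 172.27.4.39
Virtual IP Address (VIP): 172.27.4.250

When both the Master and the Slave are running normally, the Master serves requests and the Slave stands by.
When the Master goes down and the Slave is healthy, the Slave takes over the service and turns off master-slave replication.
When the Master comes back, it first synchronizes data from the Slave, then turns off replication and resumes the Master role. Meanwhile, the Slave waits for the Master to finish synchronizing and then resumes the Slave role.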

Redis installation steps:
wget http://download.redis.io/releases/redis-2.4.15.tar.gz
tar zxvf redis-2.4.15.tar.gz
mkdir -p /usr/local/redis2.4.15/etc
mkdir -p /usr/local/redis2.4.15/data
mkdir -p /usr/local/redis2.4.15/logs

cd redis-2.4.15
make PREFIX=/usr/local/redis2.4.15 install
mv redis.conf /usr/local/redis2.4.15/etc
cd src/
make install

Installation complete:
tree /usr/local/redis2.4.15/
/usr/local/redis2.4.15/
├── bin
│   ├── redis-benchmark
│   ├── redis-check-aof
│   ├── redis-check-dump
│   ├── redis-cli
│   └── redis-server
└── etc
    └── redis.conf
2 directories, 6 files

Edit the Redis configuration file:

vim /usr/local/redis2.4.15/etc/redis.conf
# Run Redis in the background; stop it with: redis-cli shutdown
daemonize yes
# Log file location
logfile /usr/local/redis2.4.15/logs/redis.log
# Name of the data file
dbfilename dump.rdb
# Directory that holds the data file
dir /usr/local/redis2.4.15/data/
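
Before starting Redis, it can help to confirm that the four directives above are really set in the file. The following is a minimal sketch, not part of Redis itself; the function name check_redis_conf is hypothetical, and it only checks that each directive appears with a value, not that the value is valid.

```shell
#!/bin/bash
# Hypothetical helper (not shipped with Redis): verify that the directives
# edited in this guide are present and uncommented in a redis.conf file.
check_redis_conf() {
    local conf="$1" key missing=0
    for key in daemonize logfile dbfilename dir; do
        # A directive counts as set when a line starts with its name,
        # followed by whitespace and a value.
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]+[^[:space:]]" "$conf"; then
            echo "missing: ${key}"
            missing=1
        fi
    done
    return "$missing"
}
```

For example: check_redis_conf /usr/local/redis2.4.15/etc/redis.conf && echo "config looks complete"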

The command to start Redis is:
/usr/local/redis2.4.15/bin/redis-server /usr/local/redis2.4.15/etc/redis.conf
Redis installation is now complete.

Then use the client to check that it started successfully.

redis-cli
redis> set foo bar
OK
redis> get foo
"bar"

Keepalived installation
Official site: http://www.keepalived.org/download.html (the latest version can be downloaded there).
Environment setup: install make, gcc, openssl, openssl-devel, popt, ipvsadm, and so on:
yum -y install gcc make openssl openssl-devel wget kernel-devel
cd /usr/local/src/
wget http://www.keepalived.org/software/keepalived-1.2.13.tar.gz
tar zxvf keepalived-1.2.13.tar.gz
cd keepalived-1.2.13
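To install into a specific directory, pass a prefix to configure:
./configure --prefix=/usr/local/keepalived
Alternatively, use the defaults (note that the symlinks created below assume the /usr/local/keepalived prefix):
./configure
When configure finishes, it prints: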
Keepalived configuration
------------------------
Keepalived version : 1.2.13
Compiler : gcc
Compiler flags : -g -O2
Extra Lib : -lssl -lcrypto -lcrypt
Use IPVS Framework : Yes
IPVS sync daemon support : Yes
IPVS use libnl : No
fwmark socket support : Yes
Use VRRP Framework : Yes
Use VRRP VMAC : Yes
SNMP support : No
SHA1 support : No
Use Debug flags : No

make && make install

Link the management files into place:
ln -s /usr/local/keepalived/sbin/keepalived /usr/sbin/
ln -s /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
ln -s /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/init.d/
Keepalived installation is now complete.
Register it as a service and enable it at boot:
chkconfig --add keepalived
chkconfig keepalived on
service keepalived start

ps -ef | grep keepalived
root 1332 1908 0 09:32 pts/0 00:00:00 grep keepalived
root 1977 1 0 Nov20 ? 00:00:02 keepalived -D
root 1978 1977 0 Nov20 ? 00:00:02 keepalived -D
root 1979 1977 0 Nov20 ? 00:00:17 keepalived -D
Once the keepalived service has started, there are three keepalived processes.

First, create the following configuration file on the Master:
mkdir -p /etc/keepalived/scripts/
vim /etc/keepalived/keepalived.conf

vrrp_script chk_redis {
    script "/etc/keepalived/scripts/redis_check.sh" ### monitoring script
    interval 2 ### check interval in seconds
}
vrrp_instance VI_1 {
    state MASTER ### this node starts as MASTER
    interface eth0 ### network interface to monitor
    virtual_router_id 51
    priority 101 ### priority
    authentication {
        auth_type PASS ### password authentication
        auth_pass redis ### password
    }
    track_script {
        chk_redis ### run the chk_redis check defined above
    }
    virtual_ipaddress {
        172.27.4.250 ### VIP
    }
    notify_master /etc/keepalived/scripts/redis_master.sh
    notify_backup /etc/keepalived/scripts/redis_backup.sh
    notify_fault /etc/keepalived/scripts/redis_fault.sh
    notify_stop /etc/keepalived/scripts/redis_stop.sh
}

Then create the following configuration file on the Slave:

mkdir -p /etc/keepalived/scripts/
vim /etc/keepalived/keepalived.conf

vrrp_script chk_redis {
    script "/etc/keepalived/scripts/redis_check.sh" ### monitoring script
    interval 2 ### check interval in seconds
}
vrrp_instance VI_1 {
    state BACKUP ### this node starts as BACKUP
    interface eth0 ### network interface to monitor
    virtual_router_id 51
    priority 100 ### lower than the MASTER's priority
    authentication {
        auth_type PASS
        auth_pass redis ### same password as on the MASTER
    }
    track_script {
        chk_redis ### run the chk_redis check defined above
    }
    virtual_ipaddress {
        172.27.4.250 ### VIP
    }
    notify_master /etc/keepalived/scripts/redis_master.sh
    notify_backup /etc/keepalived/scripts/redis_backup.sh
    notify_fault /etc/keepalived/scripts/redis_fault.sh
    notify_stop /etc/keepalived/scripts/redis_stop.sh
}
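
A missing closing brace in keepalived.conf is easy to overlook when the file is typed by hand. The sketch below is a rough sanity check only, not a real keepalived syntax check; the function name braces_balanced is hypothetical, and braces that appear inside comments are counted too.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if FILE contains as many '{' as '}'.
# This catches a forgotten closing brace but does not validate the syntax.
braces_balanced() {
    local file="$1" opens closes
    opens=$(grep -o '{' "$file" | wc -l)
    closes=$(grep -o '}' "$file" | wc -l)
    [ "$((opens))" -eq "$((closes))" ]
}
```

For example: braces_balanced /etc/keepalived/keepalived.conf || echo "unbalanced braces in keepalived.conf"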

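Create the Redis monitoring script on both the Master and the Slave:
vim /etc/keepalived/scripts/redis_check.sh
#!/bin/bash

# Ask the local Redis for a PING reply; keepalived treats exit code 0 as healthy.
ALIVE=$(/usr/local/redis2.4.15/bin/redis-cli PING)
if [ "$ALIVE" = "PONG" ]; then
    echo "$ALIVE"
    exit 0
else
    echo "$ALIVE"
    exit 1
fi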
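
A single failed PING triggers a failover with the check above. If that proves too sensitive, one possible variation is to retry briefly before reporting failure. This is a sketch under assumptions: the names redis_alive, RETRIES and RETRY_DELAY are hypothetical, the redis-cli path is taken from the install prefix used earlier in this guide, and the total retry time should stay well below the keepalived check interval.

```shell
#!/bin/bash
# Assumed path, based on the install prefix used earlier in this guide.
REDISCLI="${REDISCLI:-/usr/local/redis2.4.15/bin/redis-cli}"
RETRIES="${RETRIES:-2}"          # hypothetical: number of PING attempts
RETRY_DELAY="${RETRY_DELAY:-0.5}" # hypothetical: seconds between attempts

# Return 0 as soon as Redis answers PONG, 1 if every attempt fails.
redis_alive() {
    local attempt reply
    for ((attempt = 1; attempt <= RETRIES; attempt++)); do
        reply=$("$REDISCLI" PING 2>/dev/null)
        [ "$reply" = "PONG" ] && return 0
        # Pause only between attempts, not after the last one.
        [ "$attempt" -lt "$RETRIES" ] && sleep "$RETRY_DELAY"
    done
    return 1
}
```

In a check script, calling redis_alive as the last command makes its return value the exit code that keepalived sees.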

Write the following key scripts, which drive the failover:
notify_master /etc/keepalived/scripts/redis_master.sh
notify_backup /etc/keepalived/scripts/redis_backup.sh
notify_fault /etc/keepalived/scripts/redis_fault.sh
notify_stop /etc/keepalived/scripts/redis_stop.sh

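Keepalived calls one of these scripts according to the state it transitions to:
On entering the Master state it calls notify_master
On entering the Backup state it calls notify_backup
On detecting a problem it enters the Fault state and calls notify_fault
When the Keepalived process terminates it calls notify_stop

First, create the notify_master and notify_backup scripts on the Redis Master: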

vim /etc/keepalived/scripts/redis_master.sh
#!/bin/bash

REDISCLI="/usr/local/redis2.4.15/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"

echo "[master]" >> $LOGFILE
date >> $LOGFILE
echo "Being master...." >> $LOGFILE 2>&1

echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 172.27.4.39 6379 >> $LOGFILE 2>&1
sleep 10
# Wait 10 seconds for the data sync to finish before leaving replication mode

echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1

vim /etc/keepalived/scripts/redis_backup.sh

#!/bin/bash

REDISCLI="/usr/local/redis2.4.15/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"

echo "[backup]" >> $LOGFILE
date >> $LOGFILE
echo "Being slave...." >> $LOGFILE 2>&1

sleep 15
# Wait 15 seconds for the peer to finish syncing our data before switching roles
echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 172.27.4.39 6379 >> $LOGFILE 2>&1

Next, create the notify_master and notify_backup scripts on the Redis Slave:

vim /etc/keepalived/scripts/redis_master.sh

#!/bin/bash

REDISCLI="/usr/local/redis2.4.15/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"

echo "[master]" >> $LOGFILE
date >> $LOGFILE
echo "Being master...." >> $LOGFILE 2>&1

echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 172.27.4.38 6379 >> $LOGFILE 2>&1
sleep 10
# Wait 10 seconds for the data sync to finish before leaving replication mode

echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1

vim /etc/keepalived/scripts/redis_backup.sh

#!/bin/bash

REDISCLI="/usr/local/redis2.4.15/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"

echo "[backup]" >> $LOGFILE
date >> $LOGFILE
echo "Being slave...." >> $LOGFILE 2>&1

sleep 15
# Wait 15 seconds for the peer to finish syncing our data before switching roles
echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 172.27.4.38 6379 >> $LOGFILE 2>&1
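
The fixed sleeps in these scripts assume that the data set always synchronizes within 10 seconds. A possible alternative is to poll the replication link state reported by INFO and stop waiting as soon as it is up. This is only a sketch: the function name wait_for_sync is hypothetical, the redis-cli path is assumed from the install prefix used earlier, and it relies on the master_link_status field of the INFO output.

```shell
#!/bin/bash
# Assumed path, based on the install prefix used earlier in this guide.
REDISCLI="${REDISCLI:-/usr/local/redis2.4.15/bin/redis-cli}"

# wait_for_sync MAX_SECONDS
# Return 0 once INFO reports master_link_status:up, 1 if MAX_SECONDS pass first.
wait_for_sync() {
    local max="${1:-10}" waited=0
    while [ "$waited" -lt "$max" ]; do
        # INFO lines end in \r\n, so the pattern is anchored at the start only.
        if "$REDISCLI" INFO 2>/dev/null | grep -q '^master_link_status:up'; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}
```

In redis_master.sh, a call such as wait_for_sync 10 could take the place of sleep 10 before SLAVEOF NO ONE. When the peer is down, the link never comes up, so the call simply waits for the full timeout, which matches the behavior of the fixed sleep.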

Then create the following identical scripts on both the Master and the Slave:

vim /etc/keepalived/scripts/redis_fault.sh
#!/bin/bash

LOGFILE=/var/log/keepalived-redis-state.log

echo "[fault]" >> $LOGFILE
date >> $LOGFILE

vim /etc/keepalived/scripts/redis_stop.sh
#!/bin/bash

LOGFILE=/var/log/keepalived-redis-state.log

echo "[stop]" >> $LOGFILE
date >> $LOGFILE
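
All four notify scripts repeat the same two logging lines. As an optional tidy-up, they could share one small helper. The sketch below is an illustration only; the function name log_state is hypothetical, and the log format copies what the scripts above write by hand.

```shell
#!/bin/bash
LOGFILE="${LOGFILE:-/var/log/keepalived-redis-state.log}"

# Hypothetical helper: append a state marker, a timestamp, and an optional
# message to the log, in the same format the notify scripts use.
log_state() {
    local state="$1" message="$2"
    {
        echo "[${state}]"
        date
        [ -n "$message" ] && echo "$message"
    } >> "$LOGFILE"
    return 0
}
```

With this helper, redis_fault.sh would reduce to log_state fault, and redis_master.sh could begin with log_state master "Being master....".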

Make all the scripts executable:
chmod +x /etc/keepalived/scripts/*.sh

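With the scripts in place, test using the following procedure. (The /etc/init.d/redis commands assume a Redis init script is installed; otherwise start Redis with the redis-server command shown earlier.)
1. Start Redis on the Master
/etc/init.d/redis restart

2. Start Redis on the Slave
/etc/init.d/redis restart

3. Start Keepalived on the Master
/etc/init.d/keepalived restart

4. Start Keepalived on the Slave
/etc/init.d/keepalived restart

5. Try connecting to Redis through the VIP:
redis-cli -h 172.27.4.250 INFO

The connection succeeds, and the Slave is attached as well: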
role:master
slave0:172.27.4.39,6379,online
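
To see which node currently holds the VIP, the output of ip addr can be inspected on each server. The following is a small sketch; the function name holds_vip is hypothetical, and eth0 is the interface named in the keepalived configuration.

```shell
#!/bin/bash
# Hypothetical helper: read `ip addr` output on stdin and succeed if the
# given address is configured. The trailing "/" stops 172.27.4.25 from
# matching 172.27.4.250.
holds_vip() {
    local vip="$1"
    grep -qF "inet ${vip}/"
}
```

For example, on either server: ip -4 addr show dev eth0 | holds_vip 172.27.4.250 && echo "this node holds the VIP"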

6. Try inserting some data:
redis-cli -h 172.27.4.250 SET Hello Redis
OK

Read the data through the VIP:
redis-cli -h 172.27.4.250 GET Hello
"Redis"

Read the data from the Master:
redis-cli -h 172.27.4.38 GET Hello
"Redis"

Read the data from the Slave:
redis-cli -h 172.27.4.39 GET Hello
"Redis"

Now simulate a failure.
Kill the Redis process on the Master:
killall -9 redis-server
or: redis-cli shutdown

Check the Keepalived state log on the Master:
tail -f /var/log/keepalived-redis-state.log
[fault]
Thu Sep 27 08:29:01 CST 2012

Meanwhile, the log on the Slave shows:
tailf /var/log/keepalived-redis-state.log
[master]
Fri Sep 28 14:14:09 CST 2012
Being master....
Run SLAVEOF cmd ...
OK
Run SLAVEOF NO ONE cmd ...
OK

We can then see that the Slave has taken over the service and is now acting as the Master:
redis-cli -h 172.27.4.250 INFO
redis-cli -h 172.27.4.39 INFO
role:master
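
Reading the role out of the full INFO output by eye gets tedious when checking several addresses. A small helper can extract just that field. This is a sketch; the function name redis_role is hypothetical and the redis-cli path is assumed from the install prefix used earlier.

```shell
#!/bin/bash
# Assumed path, based on the install prefix used earlier in this guide.
REDISCLI="${REDISCLI:-/usr/local/redis2.4.15/bin/redis-cli}"

# Hypothetical helper: print the replication role ("master" or "slave")
# reported by the Redis instance at HOST.
redis_role() {
    local host="$1"
    # INFO lines end in \r\n; strip the \r before splitting on ":".
    "$REDISCLI" -h "$host" INFO 2>/dev/null \
        | tr -d '\r' \
        | awk -F: '$1 == "role" { print $2 }'
}
```

For example: for host in 172.27.4.38 172.27.4.39 172.27.4.250; do echo "$host: $(redis_role "$host")"; done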

Next, bring the Redis process on the Master back up:
/etc/init.d/redis start

Check the Keepalived state log on the Master:
tailf /var/log/keepalived-redis-state.log
[master]
Thu Sep 27 08:31:33 CST 2012
Being master....
Run SLAVEOF cmd ...
OK
Run SLAVEOF NO ONE cmd ...
OK

Meanwhile, the log on the Slave shows:
tailf /var/log/keepalived-redis-state.log
[backup]
Fri Sep 28 14:16:37 CST 2012
Being slave....
Run SLAVEOF cmd ...
OK

The original Master has resumed the Master role, so both the failover and the automatic recovery succeeded.