论坛徽章:: 0

电梯直达

1楼 [收藏(0)] [报告]

发表于 2011-02-22 18:23 |只看该作者 |倒序浏览

七、测试
1. (on master)新建用户
create role digoal nosuperuser login encrypted password 'digoal';
(on slave)查看,比log shipping模式快很多,在master建立好用户后slave马上就可以看到已经复制过来了.
postgres=# \du
                     List of roles
Role name |          Attributes          | Member of
-----------+-----------------------------------+-----------
digoal |                                  | {}
postgres  | Superuser, Create role, Create DB | {}
repuser1  | Superuser                      +| {}
         | 20 connections                   |

2. 新建表空间
on master
su - postgres
mkdir /database/pgdata/tbs3/tbs_digoal
on slave
su - postgres
mkdir /database/pgdata/tbs3/tbs_digoal
on master
create tablespace tbs_digoal owner digoal location '/database/pgdata/tbs3/tbs_digoal';
on slave (查看)
postgres=# \db
               List of tablespaces
Name |  Owner |          Location
------------+----------+----------------------------------
pg_default | postgres |
pg_global  | postgres |
tbs_digoal | digoal | /database/pgdata/tbs3/tbs_digoal
(3 rows)

3. (on master)新建数据库
create database digoal with owner digoal template template0 encoding 'UTF8' tablespace tbs_digoal;
(on slave)查看
postgres=# \l
                           List of databases
Name |  Owner | Encoding | Collation | Ctype | Access privileges
-----------+----------+----------+-----------+-------+-----------------------
digoal | digoal | UTF8    | C       | C    |
postgres  | postgres | UTF8    | C       | C    |
template0 | postgres | UTF8    | C       | C    | =c/postgres       +
         |       |       |          |    | postgres=CTc/postgres
template1 | postgres | UTF8    | C       | C    | =c/postgres       +
         |       |       |          |    | postgres=CTc/postgres
(4 rows)

4. (on master)新建schema
\c digoal digoal
create schema digoal authorization digoal;
(on slave)查看
postgres=# \c digoal digoal
You are now connected to database "digoal" as user "digoal".
digoal=> \dn
      List of schemas
      Name       |  Owner
--------------------+----------
digoal          | digoal
information_schema | postgres
pg_catalog       | postgres
pg_toast          | postgres
pg_toast_temp_1 | postgres
public          | postgres
(6 rows)

5. (on master)新建表
\c digoal digoal
create table tbl_users (id int8 , nick varchar(32));
(on slave)查看
digoal=> \c digoal digoal
You are now connected to database "digoal".
digoal=> \d tbl_users
      Table "digoal.tbl_users"
Column |       Type       | Modifiers
--------+-----------------------+-----------
id    | bigint             |
nick | character varying(32) |

6. (on master)插入测试数据
测试脚本,开50个进程后台插入.
#!/bin/bash
for ((i=0;i<50;i++))
do
psql -h 127.0.0.1 digoal digoal -c "insert into tbl_users select generate_series(1,1000000),'digoal'" &
done

查看主节点,仅仅有一个sender进程在发送数据,一个standby对应一个sender
[root@db-172-16-3-33 ~]# ps -ewf|grep sender
postgres 20921 20622  0 14:56 ?       00:00:02 postgres: wal sender process repuser1 172.16.3.39(18716) streaming 3/70000000
[root@db-172-16-3-33 ~]# netstat -anp|grep 172.16.3.39
tcp       0    0 172.16.3.33:1921          172.16.3.39:18716          ESTABLISHED 20921/EF5BDEC8

查看standby节点,
top
26154 postgres  25 0 2389m 1.0g 1.0g R 100.2 13.3 0:51.64 postgres: startup process recovering 00000001000000030000001D
26159 postgres  15 0 2401m 6052 1652 S 12.9  0.1 0:06.28 postgres: wal receiver process streaming 3/80000000
[root@db-172-16-3-39 ~]# netstat -anp|grep 172.16.3.33
tcp       0    0 172.16.3.39:18716          172.16.3.33:1921          ESTABLISHED 26159/90940000

很快,standby和master最终数据一致.
digoal=> select count(*) from tbl_users ;
  count
----------
50000000
(1 row)

7. (on master)使用DDL测试冲突
on master
alter table tbl_users add column first_name varchar(32) default 'zhou';
alter table tbl_users add column last_name varchar(32) default 'digoal';

on slave 在slave恢复期间,在slave节点执行 select count(*) from tbl_users;发生等待事件.
26232 postgres  18 0 2392m 2.0g 2.0g S  0.0 26.3 0:09.78 postgres: digoal digoal 127.0.0.1(39324) SELECT waiting

on slave /var/applog/pg_log中查看最近一个日志文件,
2011-01-04 15:17:43.268 CST,"digoal","digoal",26232,"127.0.0.1:39324",4d22c709.6678,4,"SELECT waiting",2011-01-04 15:06:49 CST,2/19,0,LOG,00000,"process 26232 still waiting for AccessShareLock on relation 16404 of database 16402 after 1000.302 ms",,,,,,"select count(*) from tbl_users;",22,,"psql"

主节点执行完后,slave节点的waiting很快消失,不会像log shipping模式可能出现继续等待含有SQL结束的WAL的情况.

8. (on master)测试checkpoint
在PostgreSQL中发生checkpoint后,在此之前的WAL在做数据库恢复时就用不到了,因为确保数据都写入数据文件了.
pg_archivecleanup也是根据checkpoint来判断和删除不需要的WAL的.

9. (on slave)测试cleanarchive
在做checkpoint前,去看$PGARCHIVE目录,已经被apply的文件还存在,并没有被pg_archivecleanup命令清除掉,原因就是这些文件是最近一次checkpoint以来的WAL文件,在数据库恢复时是需要用到的.
如果你手工执行pg_archivecleanup $PGARCHIVE 000000010000000200000031 (假设000000010000000200000031这个是在$PGARCHIVE中的一个WAL的文件名)
这条命令将删除000000010000000200000031以前生成的所有WAL文件,一定要小心操作,万一不小心把最近一次CHECKPOINT以来的WAL删除了,
补救的方法是赶紧到master上做一次checkpoint,让slave知道这次checkpoint,否则的话下次slave启动还会读到000000010000000200000031这个文件以前的文件,那时候就只能找到这些文件或重建slave了.

10. (on slave)测试active slave
激活SLAVE很简单,了解到已经apply了最新的WAL后,执行以下
su - postgres
touch /database/pgdata/tbs1/pg_root/postgresql.trigger.1921
数据库会触发激活的动作,激活后/database/pgdata/tbs1/pg_root/postgresql.trigger.1921这个文件会自动删掉,并且recovery.conf被重命名为recovery.done.
激活后的slave不可逆转为slave了.需要重建.

11. (on slave)测试write操作
postgres=# create table tbl_test (id int);
ERROR:  cannot execute CREATE TABLE in a read-only transaction

12. 监控
pg_current_xlog_insert_location
pg_current_xlog_location
pg_last_xlog_receive_location
pg_last_xlog_replay_location
top
CREATE OR REPLACE VIEW pg_stat_replication AS
SELECT
         S.procpid,
         S.usesysid,
         U.rolname AS usename,
         S.application_name,
         S.client_addr,
         S.client_port,
         S.backend_start
FROM pg_stat_get_activity(NULL) AS S, pg_authid U
WHERE S.usesysid = U.oid AND S.datid = 0;

13. 优化
1. Both the WALSender and WALReceiver will work continuously on any outstanding data to be
replicated until the queue is empty. If there is a quiet period, then the WALReceiver will sleep
for 100ms at a time, and the WALSender will sleep for wal_sender_delay. Typically, the
value of wal_sender_delay need not be altered, because it only affects behavior during
momentary quiet periods. The default value is a good balance between effciency and data
protection. If the Master and Standby are connected by a low bandwidth network, and the
write rate on the Master is high, you may wish to lower this value to perhaps 20ms or 50ms.
Reducing this value will reduce the amount of data loss if the Master becomes permanently
unavailable, though will also marginally increase the cost of streaming the transaction log
data to the Standbys.

2. If the connection drops between Master and Standby, it will take some time for that to be
noticed across an indirect network. To ensure that a dropped connection is noticed as soon
as possible, you may wish to adjust the keepalive settings.
If you want a Standby to notice that the connection to the Master has dropped, you need
to set the keepalives in the primary_conninfo in the recovery.conf on the Standby
as follows:
primary_conninfo = '….keepalives_idle= 60 …'
If you want the Master to notice that a streaming Standby connection has dropped, you can
set the keepalive parameters in postgresql.conf on the Master, such as:
tcp_keepalives_idle = 60 # time before we send keepalives
That setting will then apply to all connections from users and replication. If you want to be very
specifc, and just set that for replication, you must supply this as an option to be passed to the
Master, which is specifed like the following:
primary_conninfo = '….options="-c tcp_keepalives_idle= 60" …'
All of the preceding examples set the length of time the connection will be idle before we start
sending keepalives to be 60 seconds. The default is two hours, and is not recommended.
There are multiple keepalive parameters we can set; I have avoided showing those here
for clarity. A related option is connection_timeout. Remember, you can hide all of this
complexity in a connection service fle, so that primary_conninfo only refers to a single
service name, as described in the First Steps chapter.

3. One thing that is a possibility is to set archive_command only until the end of the catch
up period. After that you can reset it to the dummy value ("cd") and then continue just with
streaming replication. Data is only transferred from the Master to the Standby once that data
has been written (or more precisely, fsynced) to disk. So setting synchronous_commit =
off will not improve the replication delay, even if that improves performance on the Master.
Once WAL data is received by the Standby, the WAL data is fsynced to disk on the Standby to
ensure that it is not lost if the Standby system restarts.

4. For streaming replication, the Master keeps a number of fles that is at least wal_keep_
segments. If the Standby database server has been down for long enough, the Master will have
moved on and will no longer have the data for the last point of transfer. If that should occur, then
the Standby needs to be re-confgured using the same procedure with which we started.

5. You may also wish to increase max_wal_senders, so that it will be possible to reconnect
even before a dropped connection is noted; this allows a manual restart to re-establish
connections more easily. If you do this, then also increase the connection limit for the
replication user.
Data transfer may stop because the connection drops or the Standby server or the Standby
system is shutdown. If replication data transfer stops for any reason, it will attempt to restart
from the point of last transfer.

14. 注意事项
1. 清除归档时需要考虑到master-slave是一对多的情况,使用一对多的PGARCHIVE或者是全局的pg_archivecleanup

文库|博客

返回列表

Chinaunix › 论坛 › 数据库技术 › PostgreSQL › PostgreSQL HOT STANDBY using Stream续

PostgreSQL HOT STANDBY using Stream续 [复制链接]

浏览过的版块