PostgreSQL database replication

Default featured post

In PostgreSQL 9.1 and later database replication feature is provided built in. This feature allows you to have high up time and very low risk of data lose. It is also useful for data mirroring and have high usage for load balancing (For example, very useful in designing banking system). To know more please refer to this link. All you need to do is to configure your PostgreSQL for database replication. In the first glance, it sounds very difficult but once you have done it, you would realize it is not difficult at all and it is a matter of spending half an hour to do all.

Fortunately, official PostgreSQL site provided very great documentation for database replication in terms of applying the configuration in practice and definition of various db replication concepts for better understanding and knowing how it works. This made my task easier as I don’t need to repeat everything here. To start with, you can take a look at links on resource section, specially the wiki PostgreSQL one and follow one of them step by step as instructed.

In my case during db replication configuration I have faced with various problems which I have summarized list of them with solutions, hope it will be useful for others as well.

Enabling PostgreSQL user to ssh and sftp

By right you cannot ssh to PostgreSQL user from another PC, however, to connect master db to slave, it is required to open up that feature to copy ssh certificate, transfer base files and most importantly transfer WAL log of master db.

Don’t worry about abusing of opening up ssh for PostgreSQL because you can minimize the impact by securing it via ssh certificate.

Anyway, to open up ssh port for any user in Linux, you need to edit /etc/ssh/sshd_config file and add below content.

Port [PortNumber] (e.g.22)
PermitRootLogin no
AllowUsers [Username] (e.g. postgres)

For further understanding regarding the parameter please refer to this link.

Then you need to restart ssh service with below command.

$ sudo service ssh reload

Resolve problem of slave could not receive data from WAL stream

It might happen that after you have done all the configuration correctly in both servers, you can’t use psql command in slave because it is giving you a message similar to below,

FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 000000010000000F00000052 has already been removed

The reason for this error is that slave can’t get specific WAL from master.

To overcome the problem you need to follow these steps as mentioned in this link.

On master server

  1. Set max_wal_senders=5
  2. Set wal_keep_segments=4000. Yes I know it is very high, but I’d like to monitor the situation and see what happens. I have room on the master.

One slave server

  1. Save configuration files in the data directory (i.e. pg_hba.conf pg_ident.conf postgresql.conf recovery.conf)
  2. Clear out the data directory (e.g. rm -rf /var/lib/pgsql/9.3/data/*) . This seems to be required by pg_basebackup.
  3. Run the following command: pg_basebackup -h master -D /var/lib/pgsql/9.3/data --username=replication --password

Then finally run both psql again but slave first always.

Getting PostgreSQL log during runtime

Actually this is a very useful command to monitor the process of starting PostgreSQL to see whether any problems occur or not. For instance, due to one misconfiguration in my case, I was not able to start PostgreSQL and I have spent quite sometimes to troubleshoot and find a way to monitor the starting process. Finally I have found the following command which helped me a lot to identify the problem.

$ pg_ctl -D /var/lib/pgsql/data start

If you face any issue you can comment here so I can help you on this matter.

Resources