Ping a Specific Port

Question

f4nt

Asked: 2009-05-22 22:22:36 +0800 CST2009-05-22 22:22:36 +0800 CST 2009-05-22 22:22:36 +0800 CST

PostgreSQL Replication

772

We constantly bat this around the office, and the question continues to come up. How do you deal with PostgreSQL replication? I'm not even necessarily talking about advanced clusters, just keeping it simple with Master-Slave, Master-MultiSlave, and Master-Master. I find that setting it up for MySQL is typically pretty simple. Failover is straightforward if not perfect, especially for how easy it is to configure. We've played with Slony, but it's a bit too hands on (schema changes require intervention, new databases require intervention, etc). PGPool2 was pretty nice, until a node went down and we couldn't find a graceful way (other than bringing everything down and reseeding the fallen node) to get replication back in sync. Basically here's what I'm typically looking for:

Easy setup (I'll settle for difficult setup, but easy to expand)
Simplistic failover
Bringing a fallen node back in just requires time (i.e. like mysql. Server goes down, you bring it up, and wait for replication to catch up)
Schema changes don't break replication
Adding a new database to the server is seamless (i.e. like mysql, you can replicate a whole DB server, so a new database is created on the master, it automatically propagates to the slave)

MySQL handles most of these fairly well, but I hold a certain fondness for PostgreSQL. Besides, we have some situations where it's our only option, and we'd like to add replication to the mix. What are you using currently, and how do you feel about your solution? This isn't a MySQL versus PostgreSQL post, I promise, because that's not what I'm trying to start. :)

10 Answers

Voted

Michael Renner · Answer 1 · 2009-05-30T07:47:48+08:00

Short answer - there's no such solution yet for PostgreSQL if you need online readonly slaves.

There're two major development projects currently going on in this area which are included in PostgreSQL 9.0 (Spring/Summer 2010), namely:

Synchronous Replication:

http://wiki.postgresql.org/wiki/NTT's_Development_Projects

Read only hot standby slaves:

http://wiki.postgresql.org/wiki/Hot_Standby

which in combination aim to achieve the ease of use of MySQL-style replication minus the bugs/issues MySQL has plus the reliability users know from PostgreSQL.

All of this was kicked off by a manifest from the PostgreSQL Core Team in 2008:

http://archives.postgresql.org/pgsql-hackers/2008-05/msg00913.php

The PostgreSQL replication solutions to this day with the largest user base are Slony-I (more expensive for writes, makes schema changes fiddly), WAL shipping/walmgr (Slaves can't be used online) and pgQ/londiste from Skype/Skytools (more tools/building blocks than a finished solution).

I've written a few things on Log Shipping, walmgr and Slony-I, see

http://blogs.amd.co.at/mt/mt-search.cgi?blog_id=1&tag=pgrep&limit=20 for more information.

Arndt Lehmann · Answer 2 · 2009-07-01T22:41:07+08:00

Arndt Lehmann

2009-07-01T22:41:07+08:002009-07-01T22:41:07+08:00

And to throw another solution into the ring: rubyrep.

To compare with your requirements:

Easy setup
Yes, that's actually the primary focus of rubyrep.
Simplistic failover
Yes. In fact rubyrep does master-master replication - to fail over, no action is necessary at all. Just start using the other database.
Schema changes don't break replication
Yes.
For non-primary key changes replication doesn't even have to stop (but make sure the schema is changes on both sides at the same time)
To add / remove tables, simply restart the replication daemon. Only changing the primary key column of a table takes a bit of effort.
Adding a new database to the server is seamless (i.e. like mysql, you can replicate a whole DB server, so a new database is created on the master, it automatically propagates to the slave)
This is only supported in a limited way: each rubyrep setup replicates only one database at a time. (But it is very easy to set up replication for more than one database.)

5

diq · Answer 3 · 2009-05-28T10:02:30+08:00

diq

2009-05-28T10:02:30+08:002009-05-28T10:02:30+08:00

You didn't mention having a hot read-slave as a requirement, so I'm going to propose using Heartbeat with either shared storage or DRBD. It just does the right thing and administration is a breeze. It's the Linux equivalent of older Microsoft SQL Server clustering. One node is active and the other node is passive while the data is shared between the two. You don't have to worry about SQL-based replication because it's all handled lower down at the block level.

Seriously, it's by far the best solution if you don't need read slaves. The WAL archive stuff was hokey at best and you must set everything up again if you ever disrupt the shipping process for a server reboot. slony and londiste don't cut the mustard. If you want to stay on the main source tree and not go commercial, Heartbeat is your best bet.

4

dpavlin · Answer 4 · 2009-05-23T05:25:04+08:00

dpavlin

2009-05-23T05:25:04+08:002009-05-23T05:25:04+08:00

From your requirements it seems that PITR is easiest way to solve your problem:

On-line backup and point-in-time recovery (PITR)

You didn't say that you need to query slave server, so PITR might be just right.

It's standard part of PostgreSQL from version 8.0 so you probably already have everything needed to get it up and running.

If you find instructions too verbose, take a look at SkyTools WalMgr which will make process of creating/failover to hot-standby data single command task.

For more complex replication scenarios, I had good experience Slony-1, but PostgreSQL has many good replication/HA options available.

2

KevinRae · Answer 5 · 2009-05-28T09:53:26+08:00

KevinRae

2009-05-28T09:53:26+08:002009-05-28T09:53:26+08:00

If you want asynchronous master/slave replication consider Londiste (part of the skytools package from Skype) wiki.postgresql.org/wiki/Londiste_Tutorial

It's easy to install, adding a new DB is easy, replication just "catches up."

Failover is not built-in though. You will need to change your application connection strings or obfuscate the DB connection behind another layer of software.

Some schema changes are easy. Others are more difficult. It depends on your application. The next version of skytools (ver 3.0) is supposed to handle DDL and include facilities to make failover easier.

We moved to Londiste after finding Slony too painful to use.

2

mm2010 · Answer 6 · 2009-05-23T01:09:02+08:00

mm2010

2009-05-23T01:09:02+08:002009-05-23T01:09:02+08:00

See a discussion here, maybe that can help:

http://blog.endpoint.com/2009/05/competitors-to-bucardo-version-1.html

and

Competitors to Bucardo Version One, found lower down on the page:

http://www.planetpostgresql.org/

1

Alex Balashov · Answer 7 · 2009-05-28T02:10:10+08:00

Alex Balashov

2009-05-28T02:10:10+08:002009-05-28T02:10:10+08:00

There really aren't any free / open-source ways to provide what you're looking for. If you want something that is so turn-key, look at various third-party commercial replication solutions.

Now, it is possible to sort of roll your own replication with Postgres using write-head log (WAL) shipping:

http://www.postgresql.org/docs/8.3/interactive/warm-standby.html

This is basically where you can put a secondary node into continuous recovery mode and import transaction logs into it every {small interval}. The Postgres configuration has "stubs" to allow you to do certain things when a Postgres when a WAL is completed and so no, and that's what that setup is predicated on -- utilising those "stubs."

However, that doesn't allow you to do master-master and/or circular replication.

In any case, it definitely works for erdundancy, but I would not call it "easy setup," "simplistic failover," "seamless," or anything like that.

1

Alexey Klyukin · Answer 8 · 2009-05-30T10:18:41+08:00

Alexey Klyukin

2009-05-30T10:18:41+08:002009-05-30T10:18:41+08:00

Except for 'adding a new database' thing you can try Mammoth Replicator (https://projects.commandprompt.com/public/replicator). It's open-source, easy to setup and supports failover. The major limitations are single database and inability to replicate DDL changes, both are in the TODO list.

1

Aleksandar Ivanisevic · Answer 9 · 2009-05-27T10:30:06+08:00

Aleksandar Ivanisevic

2009-05-27T10:30:06+08:002009-05-27T10:30:06+08:00

I'm currently looking at Tungsten replicator, I'm still far away from any definite conclusion, but probably worth a look.

www.continuent.com

0

bbigras · Answer 10 · 2009-05-27T11:28:03+08:00

bbigras

2009-05-27T11:28:03+08:002009-05-27T11:28:03+08:00

Postgres-R looked promising but I don't know if the project is still alive.

0

PostgreSQL Replication

Ping a Specific Port

What port does SFTP use?

Resolve host name from IP address

How can I sort du -h output by size

Command line to list users in a Windows Active Directory group?

What's the command-line utility in Windows to do a reverse DNS look-up?

How to check if a port is blocked on a Windows machine?

What port should I open to allow remote desktop?

What is a Pem file and how does it differ from other OpenSSL Generated Key File Formats?

How to determine if a bash variable is empty?