What would be the best approach for CouchDB replication in the following setup:
A1 and A2 are two CouchDB servers in one DC. They both pull data from each other, although only one is actively used; the other is just a standby in case the first one fails.
B1 and B2 are similarly set up in terms of replication and are located in a different DC.
What's the best way of achieving A <-> B replication?
I see two options here:
Option 1:
- A1 pulls from B1 and B2
- A2 pulls from B1 and B2
- B1 pulls from A1 and A2
- B2 pulls from A1 and A2
- A1 pulls from A2
- A2 pulls from A1
- B1 pulls from B2
- B2 pulls from B1
Option 2:
- A1 pulls from B1
- A2 pulls from B2
- B1 pulls from A1
- B2 pulls from A2
- A1 pulls from A2
- A2 pulls from A1
- B1 pulls from B2
- B2 pulls from B1
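For illustration, Option 2 could be written down as documents for CouchDB's `_replicator` database, one set per server. This is only a rough sketch: the host names and the database name `mydb` are made up, and it assumes each pull replication is stored on the node doing the pulling. `source`, `target` and `continuous` are the standard replication document fields.

```python
import json

DB_NAME = "mydb"  # hypothetical database name

# Hypothetical host URLs for the four servers.
HOSTS = {
    "A1": "http://a1.example.com:5984",
    "A2": "http://a2.example.com:5984",
    "B1": "http://b1.example.com:5984",
    "B2": "http://b2.example.com:5984",
}

# Option 2 as (target, source) pairs: "target pulls from source".
OPTION_2_PULLS = [
    ("A1", "B1"), ("A2", "B2"), ("B1", "A1"), ("B2", "A2"),
    ("A1", "A2"), ("A2", "A1"), ("B1", "B2"), ("B2", "B1"),
]


def replication_doc(target, source):
    """Build one continuous pull replication document."""
    return {
        "_id": f"pull-{source}-to-{target}".lower(),
        "source": f"{HOSTS[source]}/{DB_NAME}",
        "target": f"{HOSTS[target]}/{DB_NAME}",
        "continuous": True,
    }


def docs_by_node(pulls):
    """Group replication documents by the node that would store them."""
    docs = {}
    for target, source in pulls:
        docs.setdefault(target, []).append(replication_doc(target, source))
    return docs


if __name__ == "__main__":
    # Print the documents each node would hold in its _replicator database.
    print(json.dumps(docs_by_node(OPTION_2_PULLS), indent=2))
```

Each server ends up with two documents: one pulling from its local peer and one pulling from its counterpart in the other DC.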
IMHO Option 2 is sufficient and covers all bases for the HA setup, i.e. one way or the other, no single failure would prevent data from being replicated to all 4 DB instances.
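The single-failure claim can be sanity-checked by treating each "X pulls from Y" as a directed edge from Y to X and testing whether the remaining servers can still reach each other after any one server goes down. The sketch below only models whole-server failures (not network partitions or replication lag), and the function names are my own.

```python
# Each pair is (target, source): "target pulls from source",
# so data flows from source to target.
OPTION_1 = {
    ("A1", "B1"), ("A1", "B2"), ("A2", "B1"), ("A2", "B2"),
    ("B1", "A1"), ("B1", "A2"), ("B2", "A1"), ("B2", "A2"),
    ("A1", "A2"), ("A2", "A1"), ("B1", "B2"), ("B2", "B1"),
}
OPTION_2 = {
    ("A1", "B1"), ("A2", "B2"), ("B1", "A1"), ("B2", "A2"),
    ("A1", "A2"), ("A2", "A1"), ("B1", "B2"), ("B2", "B1"),
}
NODES = {"A1", "A2", "B1", "B2"}


def reachable(start, pulls, alive):
    """Return the set of live nodes that a write on `start` eventually reaches."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for target, source in pulls:
            if source == node and target in alive and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def survives_single_node_failure(pulls):
    """True if, with any one node down, every survivor still reaches all others."""
    for failed in NODES:
        alive = NODES - {failed}
        for start in alive:
            if reachable(start, pulls, alive) != alive:
                return False
    return True


def cross_dc_pulls(pulls):
    """Count replications whose source and target sit in different DCs."""
    return sum(1 for target, source in pulls if target[0] != source[0])


if __name__ == "__main__":
    for name, pulls in (("Option 1", OPTION_1), ("Option 2", OPTION_2)):
        print(name, survives_single_node_failure(pulls), cross_dc_pulls(pulls))
```

Under this model both options survive the loss of any single server, while Option 2 needs 4 cross-DC replications instead of the 8 in Option 1.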
There's not much data in there, we're talking about 50-100MB of data max.
Comments welcome. Thanks!
What I'd be looking for is to minimize cross DC operations as:
Then your replication scheme should take into consideration the "safest" path: basically, make sure that you replicate data across data centers without impacting your main servers.