dbmirror revisions

From: "Ed L(dot)" <pgsql(at)bluepolka(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: dbmirror revisions
Date: 2003-04-03 23:37:03
Message-ID: 200304031637.03517.pgsql@bluepolka.net
Lists: pgsql-general pgsql-hackers

I've been modifying dbmirror and wanted to offer my changes to anyone who
cares to experiment, FWIW. My effort is ongoing, the docs aren't perfect,
I make no claims of production readiness, and testing of this latest
version has been minimal, so I strongly advise you to conduct your own
thorough testing before considering a production deployment. That said,
it's a significantly improved solution for our async master-slave needs,
with a few caveats below, and shouldn't be too hard to set up.

There are enough changes that I would hardly consider this a patch; it is
closer to an overhaul, since I've removed files, renamed others, and added
new files. Among the changes I've made so far:

* Added script for easier setup of many tables/dbs/slaves;
* Added initial support for multiple masters replicating distinct data to a
single slave;
* Added batching to minimize load on master and net traffic. You can grab
a configurable number of updates to replicate before hitting the master
again;
* Added port specification;
* Wrapped all replication in transactions;
* Bulletproofed against downed master or slave;
* Started modularization of DB access layer, added some error
handling;
* Added a number of config vars for sync delays, etc;
* Eliminated bug in transaction ordering for replay. Updates cannot
be replicated in the order of the transactions (see archives for discussion
of why).

* Eliminated need for clear_pending.pl by making dbmirror.pl
self-clearing;
* Collapsed schema into 1 queue table for performance;
* Changed sequence ID column types to BIGINT for 64-bit sequence;
* Added reconnection handling for robustness;
* Added local tracking of last seq_id to help with recovery
robustness;
* Added master/slave compatibility checking;
* Enabled slave setup during production service so master does not
have to stop serving;
* Renamed tables to minimize namespace conflicts;
* Added lots of logging/debug messages;

* Maybe a few other things I've forgotten...
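
To make the batching, transaction-wrapping, self-clearing, and last-seq_id
items above a bit more concrete, here is a minimal sketch in Python. It is
not the dbmirror.pl code (which is Perl), and everything in it is an
assumption for illustration only: the table names pending_queue and replica,
the BATCH_SIZE setting, and the use of in-memory SQLite databases as
stand-ins for the PostgreSQL master and slave.

```python
import sqlite3

BATCH_SIZE = 3  # hypothetical config var: max queued updates fetched per trip to the master


def fetch_batch(master, last_seq_id, batch_size=BATCH_SIZE):
    """Fetch the next batch of queued updates after last_seq_id, in seq_id order."""
    cur = master.execute(
        "SELECT seq_id, payload FROM pending_queue "
        "WHERE seq_id > ? ORDER BY seq_id LIMIT ?",
        (last_seq_id, batch_size),
    )
    return cur.fetchall()


def replicate_once(master, slave, last_seq_id, batch_size=BATCH_SIZE):
    """Replay one batch on the slave in a transaction, then clear it from the master queue.

    Returns the seq_id of the last update applied (unchanged if the queue was empty).
    """
    batch = fetch_batch(master, last_seq_id, batch_size)
    if not batch:
        return last_seq_id
    with slave:  # commits on success, rolls back if an exception is raised
        slave.executemany(
            "INSERT INTO replica (seq_id, payload) VALUES (?, ?)", batch
        )
    new_last = batch[-1][0]
    with master:  # self-clearing: drop queue entries that have been replayed
        master.execute("DELETE FROM pending_queue WHERE seq_id <= ?", (new_last,))
    return new_last


def demo():
    """Queue 7 updates on a stand-in master and drain them in batches of BATCH_SIZE."""
    master = sqlite3.connect(":memory:")
    slave = sqlite3.connect(":memory:")
    master.execute(
        "CREATE TABLE pending_queue (seq_id INTEGER PRIMARY KEY, payload TEXT)"
    )
    slave.execute("CREATE TABLE replica (seq_id INTEGER PRIMARY KEY, payload TEXT)")
    with master:
        master.executemany(
            "INSERT INTO pending_queue (payload) VALUES (?)",
            [(f"update {i}",) for i in range(1, 8)],
        )
    last_seq_id, trips = 0, 0
    while True:
        new_last = replicate_once(master, slave, last_seq_id)
        if new_last == last_seq_id:  # nothing left in the queue
            break
        last_seq_id, trips = new_last, trips + 1
    replicated = slave.execute("SELECT COUNT(*) FROM replica").fetchone()[0]
    remaining = master.execute("SELECT COUNT(*) FROM pending_queue").fetchone()[0]
    return trips, replicated, remaining, last_seq_id
```

In this sketch the last applied seq_id is just a local variable; a real
replicator would presumably persist it somewhere so that it survives a
restart, which is the point of tracking it for recovery.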

AFAICS, there are still at least a few major drawbacks to this approach:

* DDL statements (schema changes such as CREATE TABLE or ALTER TABLE) are
not replicated (same for eRServer, AFAIK).

* SEQUENCE objects are not handled; nextval() will not be replicated, so
sequence objects (and serial columns) between master and slave can easily
get out of sync. I wonder if eRServer has this same issue?

* Mass updates/deletes/inserts of 5000 rows with a single SQL command on
the master will result in 5000 individual trigger-firings, and 5000
individual replication inserts on the slave. Rumor has it eRServer's
snapshot gets around this problem.
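
The mass-update drawback is easy to demonstrate. The following is a rough
Python sketch that uses an in-memory SQLite database as a stand-in for
PostgreSQL; the accounts and pending_queue tables and the trigger name are
made up for illustration and are not the dbmirror schema. A single UPDATE
statement touching every row fires the row-level trigger once per row, so
the queue ends up with one entry per affected row.

```python
import sqlite3


def count_queued_rows(n_rows):
    """Run one UPDATE over n_rows rows and return how many queue entries it produced."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER);
        CREATE TABLE pending_queue (
            seq_id INTEGER PRIMARY KEY,
            row_id INTEGER,
            op     TEXT
        );
        -- Row-level trigger: records one queue entry for every updated row.
        CREATE TRIGGER accounts_mirror AFTER UPDATE ON accounts
        FOR EACH ROW
        BEGIN
            INSERT INTO pending_queue (row_id, op) VALUES (NEW.id, 'UPDATE');
        END;
    """)
    with db:
        db.executemany(
            "INSERT INTO accounts (id, balance) VALUES (?, 0)",
            [(i,) for i in range(1, n_rows + 1)],
        )
        # One SQL command on the "master" that touches every row.
        db.execute("UPDATE accounts SET balance = balance + 1")
    return db.execute("SELECT COUNT(*) FROM pending_queue").fetchone()[0]
```

Calling count_queued_rows(5000) yields 5000 queued entries from that one
UPDATE, each of which would then have to be replayed individually on the
slave.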

The code is here:

http://bluepolka.net/dbmirror/dbmirror-20030403-1605.tar.gz

Ed
