I've been toying with the idea of implementing database replication for the last few days.  The system I'm proposing will be a separate program which can be run on any machine and will most likely be implemented in Python.  What I'm looking for at this point are gaping holes in my thinking/logic/etc.  Here's what I'm thinking...

 

1) I want to make this program an additional layer over PostgreSQL.  I really don't want to hack server code if I can get away with it.  At this point I don't feel I need to.

2) The replication system will need to add at least one field to each table in each database that needs to be replicated.  This field will be a date/time stamp which identifies the "last update" of the record.  This field will be called PGR_TIME for lack of a better name.  Because this field will only be touched from within the replication programs and triggers, it can have a longer, distinctive name so it isn't mistaken for a user field.
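
For what it's worth, the per-table change could be as simple as this -- a minimal sketch, assuming the Python side talks to the backend through a driver like psycopg2, with "customers" standing in for a real table name:

    import psycopg2  # assumption: psycopg2 (or similar) as the driver

    def add_pgr_time(conn, table):
        # Bolt the replication timestamp onto an existing table.
        # timestamptz, so the value is stored internally in UTC (see #3).
        cur = conn.cursor()
        cur.execute("ALTER TABLE %s ADD COLUMN pgr_time timestamptz" % table)
        conn.commit()

    # "customers" is just a stand-in; table names come from the config,
    # not from users, so plain string formatting is good enough here.
    conn = psycopg2.connect("dbname=mydb")
    add_pgr_time(conn, "customers")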

3) For each table to be replicated, the replication system will programmatically add one plpgsql function and trigger to modify the PGR_TIME field on both UPDATEs and INSERTs.  The names of this function and trigger will be along the lines of <table_name>_replication_update_trigger and <table_name>_replication_update_function.  The function is a simple two-line chunk of code that sets the PGR_TIME field to now(); the trigger fires before each INSERT/UPDATE.  Looking at the docs, I see that times are stored in Zulu (GMT) time, so I shouldn't have to worry about time zones and the like.  I need direction on this part (such as "hey dummy, look at page N of file X.").
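
Concretely, here's roughly the pair of objects I'd have the tool generate per table -- a sketch only; {t} gets substituted with the table name, and the dollar quoting is just for readability (an older backend would want the body as an ordinary quoted string):

    # Template for the per-table function/trigger pair; {t} is filled in
    # with the table name from the replication config.
    TRIGGER_TEMPLATE = """
    CREATE FUNCTION {t}_replication_update_function() RETURNS trigger AS $$
    BEGIN
        NEW.pgr_time := now();  -- the whole job: stamp the row
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER {t}_replication_update_trigger
        BEFORE INSERT OR UPDATE ON {t}
        FOR EACH ROW EXECUTE PROCEDURE {t}_replication_update_function();
    """

    def install_trigger(conn, table):
        cur = conn.cursor()
        cur.execute(TRIGGER_TEMPLATE.format(t=table))
        conn.commit()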

4) At this point we have tables which can, at a basic level, tell the replication system when they were last updated.

5) The replication system will have a database of its own to record the last replication event and to hold configuration, logs, etc.  I'd prefer to store the configuration in a PostgreSQL table, but it could just as easily be stored in a text file on the filesystem somewhere.
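
The bookkeeping schema could start as small as this (the columns are pure guesswork on my part at this point):

    # Hypothetical state tables for the replication database itself.
    STATE_SCHEMA = """
    CREATE TABLE pgr_replicated_table (
        table_name       text PRIMARY KEY,
        remote_host      text,
        last_replication timestamptz  -- when this table was last synced
    );

    CREATE TABLE pgr_log (
        logged_at timestamptz DEFAULT now(),
        message   text
    );
    """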

6) To handle replication I basically check the local "last replication time" and compare it against the remote PGR_TIME fields.  If the remote PGR_TIME is greater than the last replication time, I change the local copy of the record; otherwise, I change the remote end.  At this point I don't have a way to know WHICH field changed between the two replicas, so either I do ROW-level replication or I diff each field, using PGR_TIME to determine which row is the most current.  Some fine tuning of this process will have to occur, no doubt.
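
In Python, one pass over a single table might look roughly like this (a sketch under the assumptions above: PGR_TIME columns on both sides, one open connection per side, and a hypothetical apply_row() helper):

    def sync_table(local, remote, table, last_replication):
        # Pull rows the remote end changed since the last replication...
        rcur = remote.cursor()
        rcur.execute("SELECT * FROM %s WHERE pgr_time > %%s" % table,
                     (last_replication,))
        for row in rcur.fetchall():
            apply_row(local, table, row)   # newer remote row wins locally

        # ...then push rows we changed over the same window.
        lcur = local.cursor()
        lcur.execute("SELECT * FROM %s WHERE pgr_time > %%s" % table,
                     (last_replication,))
        for row in lcur.fetchall():
            apply_row(remote, table, row)  # newer local row wins remotely

        local.commit()
        remote.commit()

    # apply_row() is a hypothetical UPDATE-or-INSERT on the matching
    # primary key -- not written yet.

A row touched on both ends in the same window is the conflict case; comparing the two PGR_TIME values and letting the newer row win is the obvious, if crude, tie-breaker.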

7) The command-line utility, fired off by something like cron, could run several times during the day -- command-line parameters can be implemented to say PUSH ALL CHANGES TO SERVER A, or PULL ALL CHANGES FROM SERVER B.
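
The front end itself could be as dumb as this (option names are invented for the sake of the sketch, and run_replication() is a hypothetical driver routine):

    import argparse

    def main():
        p = argparse.ArgumentParser(description="pgr: PostgreSQL replication")
        p.add_argument("--push", metavar="HOST",
                       help="push all changes to HOST")
        p.add_argument("--pull", metavar="HOST",
                       help="pull all changes from HOST")
        args = p.parse_args()
        if args.push:
            run_replication("push", args.push)  # hypothetical driver
        elif args.pull:
            run_replication("pull", args.pull)

    if __name__ == "__main__":
        main()

    # crontab entry, e.g. every four hours:
    # 0 */4 * * *  /usr/local/bin/pgr --push serverA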

 

Questions/Concerns:

1) How far do I go with this?  Do I start manhandling the system catalogs (pg_* tables)?
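
Just reading them seems harmless enough; for instance, a query like this against pg_class and pg_attribute would find the user tables that still need a PGR_TIME column:

    # Plain tables that don't have the replication column yet.  Unquoted
    # identifiers fold to lower case, hence 'pgr_time'.
    FIND_UNSTAMPED = """
    SELECT c.relname
      FROM pg_class c
     WHERE c.relkind = 'r'            -- plain tables only
       AND c.relname NOT LIKE 'pg_%'  -- skip the system catalogs
       AND NOT EXISTS (SELECT 1
                         FROM pg_attribute a
                        WHERE a.attrelid = c.oid
                          AND a.attname = 'pgr_time')
    """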

2) As to #2 and #3 above, I really don't like tools automagically changing my tables, but at this point I don't see a way around it.  I guess this is where the testing comes into play.

3) Security: the replication app will have to have pretty broad rights to the database so it can add the necessary functions and triggers, modify table schemas, etc.
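
One way to contain that might be a dedicated account holding only table-level grants, along these lines (a guess at the minimum; initial setup would still have to run as the table owner):

    # Hypothetical privilege setup for a dedicated replication account;
    # "customers" again stands in for each replicated table.  Creating
    # the plpgsql functions also needs CREATE on the schema, and the
    # ALTER TABLE in #2 still requires table ownership.
    GRANTS = """
    CREATE USER pgr_agent;
    GRANT SELECT, INSERT, UPDATE ON customers TO pgr_agent;
    GRANT TRIGGER ON customers TO pgr_agent;  -- enough to CREATE TRIGGER
    """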

 

  So, any "you're insane and should run home to momma" comments?

 

              Damond