database replication

From: DWalker(at)black-oak(dot)com
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: database replication
Date: 1999-12-24 15:27:59
Message-ID: OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

<P>I've been toying with the idea of implementing database replication for the last few days. &nbsp;The system I'm proposing will be a seperate program which can be run on any machine and will most likely be implemented in Python. &nbsp;What I'm looking for at this point are gaping holes in my thinking/logic/etc. &nbsp;Here's what I'm thinking...</P><P>&nbsp;</P><P>1) I want to make this program an additional layer over PostgreSQL. &nbsp;I really don't want to hack server code if I can get away with it. &nbsp;At this point I don't feel I need to.</P><P>2) The replication system will need to add at least one field to each table in each database that needs to be replicated. &nbsp;This field will be a date/time stamp which identifies the &quot;last update&quot; of the record. &nbsp;This field will be called PGR_TIME for lack of a better name. &nbsp;Because this field will be used from within programs and triggers it can be longer so as to not mistake it for a user field.</P><P>3) For each table to be replicated the replication system will programatically add one plpgsql function and trigger to modify the PGR_TIME field on both UPDATEs and INSERTs. &nbsp;The name of this function and trigger will be along the lines of &lt;table_name&gt;_replication_update_trigger and &lt;table_name&gt;_replication_update_function. &nbsp;The function is a simple two-line chunk of code to set the field PGR_TIME equal to NOW. &nbsp;The trigger is called before each insert/update. &nbsp;When looking at the Docs I see that times are stored in Zulu (GT) time. &nbsp;Because of this I don't have to worry about time zones and the like. &nbsp;I need direction on this part (such as &quot;hey dummy, look at page N of file X.&quot;).</P><P>4) At this point we have tables which can, at a basic level, tell the replication system when they were last updated.</P><P>5) The replication system will have a database of its own to record the last replication event, hold configuration, logs, etc. &nbsp;I'd prefer to store the configuration in a PostgreSQL table but it could just as easily be stored in a text file on the filesystem somewhere.</P><P>6) To handle replication I basically check the local &quot;last replication time&quot; and compare it against the remote PGR_TIME fields. &nbsp;If the remote PGR_TIME is greater than the last replication time then change the local copy of the database, otherwise, change the remote end of the database. &nbsp;At this point I don't have a way to know WHICH field changed between the two replicas so either I do ROW level replication or I check each field. &nbsp;I check PGR_TIME to determine which field is the most current. &nbsp;Some fine tuning of this process will have to occur no doubt.</P><P>7) The commandline utility, fired off by something like cron, could run several times during the day -- command line parameters can be implemented to say PUSH ALL CHANGES TO SERVER A, or PULL ALL CHANGES FROM SERVER B.</P><P>&nbsp;</P><P>Questions/Concerns:</P><P>1) How far do I go with this? &nbsp;Do I start manhandling the system catalogs (pg_* tables)?</P><P>2) As to #2 and #3 above, I really don't like tools automagically changing my tables but at this point I don't see a way around it. &nbsp;I guess this is where the testing comes into play.</P><P>3) Security: the replication app will have to have pretty good rights to the database so it can add the nessecary functions and triggers, modify table schema, etc. &nbsp;</P><P>&nbsp;</P><P>&nbsp; So, any &quot;you're insane and should run home to momma&quot; comments?</P><P>&nbsp;</P><P>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Damond</P><P></P>

Attachment Content-Type Size
unknown_filename text/html 3.6 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 1999-12-24 15:52:50 Re: [HACKERS] Error "vacuum pg_proc"
Previous Message Tom Lane 1999-12-24 15:06:36 Re: [HACKERS] Source code format votes