Re: updating remote database

From: Achilleus Mantzios <achill(at)matrix(dot)gatewaynet(dot)com>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: Kenneth Gonsalves <lawgon(at)thenilgiris(dot)com>, <pgsql-sql(at)postgresql(dot)org>
Subject: Re: updating remote database
Date: 2004-02-26 09:55:21
Message-ID: Pine.LNX.4.44.0402261124100.3960-100000@matrix.gatewaynet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

O kyrios Richard Huxton egrapse stis Feb 26, 2004 :

> On Thursday 26 February 2004 02:54, Kenneth Gonsalves wrote:
> > i have a database on a local machine, and another on a remote machine. I
> > have a dialup connection between the two - so band width is a problem. What
> > is the most efficient way of updating the remote from the local? Does SQL
> > or postgres have simple commands for this?
>

You said "update", which i assume is different from dump/restore.

I have made substantial modifications to dbmirror, which now provides
Foreign Key dependency oriented, row grained, conditional, Asynchronous,
Lazy replication.

FK dependency oriented means that rows tables involved as parent in FK
dependencies are replicated only if needed by a child row.

Row grained means the decision of mirroring depends on the
value of the actual row, thus giving the potential to
selectively mirror rows to certain slaves.

Conditional means that a table is either
- not mirrored at all
- unconditionally mirrored to all slaves (you can have many slaves)
- conditionally mirrored based on the value of column.

Async means you can have your updates in .sql text files
(in xaction order), and then transfer them (possibly bzip2'ed) and
execute them to the client.
Also if you have tcp/ip you can call dbmirror in the traditional
way (online), leaving the job of compression to libpq.

Lazy means that you dont need to mirror anything in advance untill
its really needed.

Lets say you have a parent table, and you insert 1m rows.
Then a child table row that is to be mirrored is updated
to point to a row of the parent.
Then only this parent row will be mirrored to your client.
Now if a change is made on this very parent row, then
this change will be also mirrored to the client.

This way you will mirror only the minimal ammount of transactions
needed in order the 2 dbs to be in sync.

I warn you that i dont consider this work fully tested
(altho i think it will surely suit simple setups).
Also a planing phase is mandatory in order to have
the system running correctly.

There is an accounting mechanism that remembers who slave has
which row.

you can see some discussion here:
http://gborg.postgresql.org/pipermail/pgreplication-general/2003-December/001251.html

My version of dbmirror didnt make it to be an applied patch
mainly due to the different goals between the two.

However if you want the code or guidance just ask.

> Sounds like you want some form of (batched) asynchronous replication (as it's
> called). If you've not already set that up though, that won't help here.
>
> Assuming you don't have a complete list of all changes logged somewhere, you
> might want to try:
>
> 1. pg_dump the tables you want to synchronise on the local machine (one per
> file)
> 2. Do the same on the remote machine
> 3. Use rsync to update the remote dump based on the local one
> 4. Restore the updated dump on the remote machine.
>
> Failing that, you might want to look into the replication options available -
> you may be able to adapt contrib/dbmirror, or perhaps erserver/rservimp on
> gborg.postgresql.org
>
>

--
-Achilleus

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Philip J. Boonzaaier 2004-02-26 10:18:57 Triggers
Previous Message Jan Pips 2004-02-26 09:51:41 Field list from table