Re: modification time & transaction synchronisation problem

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Ostrovsky Eugene <e79ene(at)yandex(dot)ru>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: modification time & transaction synchronisation problem
Date: 2010-04-19 08:51:07
Message-ID: 4BCC197B.9010202@postnewspapers.com.au
Lists: pgsql-general

Ostrovsky Eugene wrote:
> Hi.
> I need to export data from the database to external file. The difficulty
> is that only data modified or added since previous export should be
> written to the file.
> I consider adding "modification_time" timestamp field to all the tables
> that should be exported. Then I can set this field to now() within ON
> UPDATE OR INSERT trigger.
> During export I can select modified data with 'WHERE modification_time >
> last_export_time' clause.
>
> It seems to be the solution but...
> What if a concurrent (and not yet committed) transaction modified some
> data before the export transaction begins? Those modifications would
> not be visible to the export transaction, so the modified data would
> not be included in the export file. It also won't be included in the
> next export, because its modification time is less than the current
> export's start time (the new value of last_export_time).
>
> Thus some data could be lost from the export file sequence. And that is
> not good at all.
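
For reference, the column-plus-trigger scheme you describe would look
something like this (the table and trigger names here are hypothetical;
apply the same pattern to each exported table):

```sql
-- Hypothetical exported table; repeat for every table you export.
ALTER TABLE my_table
    ADD COLUMN modification_time timestamptz NOT NULL DEFAULT now();

CREATE OR REPLACE FUNCTION touch_modification_time() RETURNS trigger AS $$
BEGIN
    -- Stamp the row on every write.
    NEW.modification_time := now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER my_table_touch
    BEFORE INSERT OR UPDATE ON my_table
    FOR EACH ROW EXECUTE PROCEDURE touch_modification_time();
```

Note that now() is the *transaction start* time, which is exactly why a
long-running writer can later commit rows stamped earlier than the
last_export_time your export recorded.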

About the only solid solution I can think of right now is to LOCK TABLE
the table you want to dump. You can use a lock mode that permits SELECT
but blocks UPDATE/INSERT/DELETE from other transactions. That way your
modification-time approach works.
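
A sketch of such an export transaction (table name hypothetical;
`:'last_export_time'` is a psql variable standing in for however you
track the previous export):

```sql
BEGIN;

-- SHARE mode permits concurrent SELECTs but conflicts with
-- ROW EXCLUSIVE, so INSERT/UPDATE/DELETE block until we commit.
-- The LOCK itself waits for any in-flight writers to finish,
-- so every earlier modification is either committed and visible
-- here, or rolled back.
LOCK TABLE my_table IN SHARE MODE;

SELECT *
FROM my_table
WHERE modification_time > :'last_export_time';

COMMIT;
```

The cost, of course, is that writers stall for the duration of the dump.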

(I strongly suggest leaving the modification time field without an
index, so that HOT can do in-place replacement of the rows and you avoid
a whole lot of bloat.)

There might be another possible approach that uses the system
"xmin/xmax" fields of each tuple. That would let your incremental dumps
run read-only, saving you a whole lot of expensive I/O and bloat. I'm
just not sure yet whether what I have in mind will work. I'll check back
in once I've had a play and written a test script or two.
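
If it does work, the rough shape (entirely untested, using the txid_*
functions added in 8.4, and ignoring xid epoch/wraparound, which a real
implementation would have to handle) might be something like:

```sql
-- At export time, record the current snapshot instead of a timestamp:
SELECT txid_current_snapshot();
-- store the result, e.g. '1000:1050:1005,1012'

-- On the next run, a row counts as new-or-changed if its inserting
-- transaction was NOT visible to the previous export's snapshot:
SELECT *
FROM my_table   -- hypothetical table name
WHERE NOT txid_visible_in_snapshot(
    xmin::text::bigint,          -- naive xid -> bigint; wraparound-unsafe
    '1000:1050:1005,1012'        -- the snapshot saved by the last export
);
```

That would pick up rows committed by transactions that were still in
flight when the previous export ran, which is exactly the hole in the
timestamp scheme.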

(If it does, then surely there'd be a "pg_dump --incremental-as-of" by
now ...)

--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/
