Re: synchronized snapshots

From: Markus Wanner <markus(at)bluegap(dot)ch>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: synchronized snapshots
Date: 2010-02-10 18:05:38
Message-ID: 4B72F572.7040501@bluegap.ch
Lists: pgsql-hackers

Hi Joachim,

On Wed, 10 Feb 2010 11:36:41 +0100, Joachim Wieland <joe(at)mcknight(dot)de>
wrote:
> http://www.postgresql.org/docs/8.4/static/backup-dump.html already
> states about pg_dump: "In particular, it must have read access to all
> tables that you want to back up, so in practice you almost always have
> to run it as a database superuser." so I think there is not a big loss
> here...

Hm.. I somewhat doubt that's common practice. After all, read access to
all tables is still a *lot* less than superuser privileges. But yeah,
the documentation currently states that.

> They more or less get it "by chance" :-) They acquire a snapshot when
> they call pg_synchronize_snapshot_taken()

Oh, I see, calling the function by itself already acquires a snapshot,
even in the case of a fast-path call, it seems. Then your approach is
correct.

(I'd still feel more comfortable if I had seen a
GetTransactionSnapshot() or something akin in there.)

> and if all the backends do
> it while the other backend holds the lock in shared mode, we know that
> the snapshot won't change, so they all get the same snapshot.

Agreed, that works.

(Ab)using the ProcArrayLock for synchronization is probably acceptable
for pg_dump. However, I'd rather take another approach for a more
general implementation.

>> Also, you should probably ensure the calling transactions don't have a
>> snapshot already (let alone a transaction id).
>
> True...

Hm.. realizing that a function call per se acquires a snapshot, I fail
to see how we could check whether the call really acquired a new
snapshot, rather than reusing one the transaction already had. Consider
the following (admittedly stupid) example:

BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT version();
-- ... time goes by ...
SELECT pg_synchronize_snapshot_taken(..);

As it stands, your function would silently fail to "synchronize" the
snapshots, if other transactions committed in between the two function
calls.
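
To make the failure mode concrete, here is a minimal toy model in Python.
It is not PostgreSQL code: the Database and SerializableTxn classes, the
synchronize() method and its strict flag are all hypothetical names made
up for illustration. It only assumes what the example above relies on,
namely that a serializable transaction keeps the snapshot taken by its
first statement.

```python
class Database:
    """Toy stand-in for the server: remembers which xids have committed."""

    def __init__(self):
        self.committed = set()

    def commit(self, xid):
        self.committed.add(xid)

    def take_snapshot(self):
        # A "snapshot" here is just the set of xids committed so far.
        return frozenset(self.committed)


class SerializableTxn:
    """Toy serializable transaction: the first statement fixes the snapshot."""

    def __init__(self, db):
        self.db = db
        self.snapshot = None

    def run_statement(self):
        # Like SELECT version(): takes a snapshot only if none exists yet.
        if self.snapshot is None:
            self.snapshot = self.db.take_snapshot()
        return self.snapshot

    def synchronize(self, strict=False):
        # Hypothetical stand-in for the synchronizing function call.
        # With strict=True it refuses to run if a snapshot already exists,
        # instead of silently reusing the old one.
        if strict and self.snapshot is not None:
            raise RuntimeError("transaction already has a snapshot")
        return self.run_statement()
```

If one transaction runs a statement, another transaction then commits,
and a second, fresh transaction joins, calling synchronize() on both
returns two different snapshots without any error. With strict=True the
first transaction raises instead, which is the kind of check I have in
mind.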

> It seemed more robust and convenient to have an expiration in the
> backend itself. What would happen if you called
> pg_synchronize_snapshots() and if right after that your network
> connection dropped? Without the server noticing, it would continue to
> hold the lock and you could not log in anymore...

Hm.. that's a point. Given this approach uses the ProcArrayLock, it's
probably better to use an explicit timeout.

> But you are right: The proposed feature is a pragmatic and quick
> solution for pg_dump and similar but we might want to have a more
> general snapshot cloning procedure instead. Not having a delay for
> other activities at all and not requiring superuser privileges would
> be a big advantage over what I have proposed.

Agreed.

Regards

Markus Wanner
