Proposal: Snapshot cloning

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Proposal: Snapshot cloning
Date: 2007-01-26 03:19:06
Message-ID: 45B9732A.10201@Yahoo.com
Lists: pgsql-hackers

Granted, this one has a few open ends so far, and I'd like to receive
some constructive input on how to actually implement it.

The idea is to clone an existing serializable transaction's snapshot
visibility information from one backend to another. The semantics would
be like this:

backend1: start transaction;
backend1: set transaction isolation level serializable;
backend1: select pg_backend_pid();
backend1: select publish_snapshot(); -- will block

backend2: start transaction;
backend2: set transaction isolation level serializable;
backend2: select clone_snapshot(<pid>); -- will unblock backend1

backend1: select publish_snapshot();

backend3: start transaction;
backend3: set transaction isolation level serializable;
backend3: select clone_snapshot(<pid>);

...

This will allow a number of separate backends to assume the same MVCC
visibility, so that they can query independently while the overall
result still corresponds to one consistent snapshot of the database.
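
To illustrate why sharing xmin, xmax and the xip vector is enough for
two backends to agree on visibility, here is a minimal Python sketch. It
is a simplified model and not PostgreSQL code: the Snapshot class and
the xid_finished() helper are hypothetical names, and the sketch ignores
xid wraparound, subtransactions and whether a finished transaction
actually committed.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Snapshot:
    """Simplified snapshot model (hypothetical, for illustration only)."""
    xmin: int             # every xid below this had finished at snapshot time
    xmax: int             # every xid at or above this had not started yet
    xip: FrozenSet[int]   # xids in [xmin, xmax) that were still in progress


def xid_finished(xid: int, snap: Snapshot) -> bool:
    """Return True if xid counts as finished from the snapshot's viewpoint."""
    if xid < snap.xmin:
        return True
    if xid >= snap.xmax:
        return False
    return xid not in snap.xip


# backend1 publishes its snapshot; backend2 clones the same three fields.
published = Snapshot(xmin=100, xmax=110, xip=frozenset({103, 107}))
cloned = Snapshot(published.xmin, published.xmax, published.xip)

# Both backends now give the same answer for every transaction id.
same_view = all(
    xid_finished(xid, published) == xid_finished(xid, cloned)
    for xid in range(90, 120)
)
```

Because the visibility decision depends only on these three fields,
copying them unchanged gives the cloning backend the same view of the
database as the publishing backend.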

What I am trying to accomplish with this is to widen a bottleneck that
many current Slony users are facing. The initial copy of a database is
currently limited to one single reader to copy a snapshot of the data
provider. With the above functionality, several tables could be copied
in parallel by different client threads, feeding separate backends on
the receiving side at the same time.

The feature could also be used by a parallel version of pg_dump as well
as data mining tools.

The cloning process needs to make sure that the clone_snapshot() call is
made by the same DB user in the same database as the corresponding
publish_snapshot() call. Since publish_snapshot() only publishes
information that the backend gained legally and that is already visible
in the PGPROC shared memory (xmin and xmax being the crucial part here),
there is no risk of creating a snapshot for which data might already
have been removed by vacuum.
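
The two checks described above could look roughly like the following
Python sketch. Everything in it is an assumption made for illustration:
the registry dictionary stands in for whatever shared state would hold
published snapshots, and PublishedSnapshot, CloneRefused and the explicit
user and database arguments are hypothetical (a real backend would take
them from its own session).

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PublishedSnapshot:
    """What a publishing backend makes available (hypothetical layout)."""
    pid: int
    user: str
    database: str
    xmin: int
    xmax: int
    xip: Tuple[int, ...]


class CloneRefused(Exception):
    """Raised when a clone request fails one of the checks."""


def clone_snapshot(registry: Dict[int, PublishedSnapshot],
                   pid: int, user: str, database: str):
    """Return (xmin, xmax, xip) of the publisher if the caller may clone it."""
    pub = registry.get(pid)
    if pub is None:
        raise CloneRefused("backend %d has not published a snapshot" % pid)
    if pub.user != user:
        raise CloneRefused("snapshot was published by a different user")
    if pub.database != database:
        raise CloneRefused("snapshot was published in a different database")
    return pub.xmin, pub.xmax, pub.xip


# Stand-in for shared state, keyed by the publishing backend's pid.
registry = {
    4711: PublishedSnapshot(4711, "slony", "orders", 100, 110, (103, 107)),
}
```

A call such as clone_snapshot(registry, 4711, "slony", "orders") returns
the publisher's fields, while a request from another user or another
database raises CloneRefused.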

What I am not sure about yet is what IPC method would best suit the
transfer of the arbitrarily sized xip vector. Ideas?
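
Whatever IPC method is chosen, the payload has a fixed part (xmin, xmax)
and a variable part (the xip vector). One possible way to picture it,
purely as an assumption and not a proposal for the actual transport, is
a length-prefixed buffer of 32-bit transaction ids. The function names
and the little-endian layout in this Python sketch are hypothetical.

```python
import struct

# Fixed header: xmin, xmax, number of xip entries (all unsigned 32-bit).
HEADER = struct.Struct("<III")


def pack_snapshot(xmin, xmax, xip):
    """Encode the snapshot as a header followed by len(xip) xids."""
    body = struct.pack("<%dI" % len(xip), *xip)
    return HEADER.pack(xmin, xmax, len(xip)) + body


def unpack_snapshot(buf):
    """Decode a buffer produced by pack_snapshot(), checking its length."""
    xmin, xmax, count = HEADER.unpack_from(buf, 0)
    expected = HEADER.size + 4 * count
    if len(buf) != expected:
        raise ValueError("expected %d bytes, got %d" % (expected, len(buf)))
    xip = list(struct.unpack_from("<%dI" % count, buf, HEADER.size))
    return xmin, xmax, xip


payload = pack_snapshot(100, 110, [103, 107])
decoded = unpack_snapshot(payload)
```

The count in the header tells the receiver how many bytes to expect, so
the same layout works whether the xip vector is empty or very long.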

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck(at)Yahoo(dot)com #
