Exporting Snapshots

From: Markus Wanner <markus@bluegap.ch>
To: pgsql-cluster-hackers@postgresql.org
Cc: Joachim Wieland <joe@mcknight.de>
Subject: Exporting Snapshots
Date: 2010-02-06 07:50:38
Message-ID: 4B6D1F4E.7070104@bluegap.ch
Lists: pgsql-cluster-hackers

Hi,

the very first item on the ClusterFeatures [1] wishlist is "Export 
snapshots to other sessions". Joachim Wieland has recently sent in a 
patch to hackers [2] which he called "Synchronized Snapshots". To me 
that sounded similar enough to be worth reviewing.

That patch doesn't really "export" a snapshot, but rather just tries to 
make sure the transactions start with the same snapshot. They can then 
do whatever they want, including writing and committing or aborting 
whenever they want.

But for any kind of parallel querying (be it on a single node or across 
multiple nodes) we need to be able to export a snapshot of a transaction 
to another backend - at any point during the origin transaction's lifetime.

This includes the full XIP array (list of transactions in progress at 
the time of snapshot creation) as well as making sure the data that's 
already written (but uncommitted) by that transaction is available to 
the destination backend (which is a no-op for a single node, but needs 
care for remote backends).
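To make the shape of such an export a bit more concrete, here is a minimal 
C sketch of the payload a destination backend would need. It is purely 
illustrative, loosely modelled on the fields of PostgreSQL's SnapshotData; 
the struct and field names are my own assumptions, not an existing interface:

#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint32_t CommandId;

/*
 * Illustrative payload for an exported snapshot (not an existing
 * PostgreSQL structure): everything a destination backend needs to
 * reproduce the origin's MVCC view.
 */
typedef struct ExportedSnapshot
{
    TransactionId  xmin;       /* all xids < xmin are already finished */
    TransactionId  xmax;       /* all xids >= xmax are invisible */
    uint32_t       xcnt;       /* number of entries in xip[] */
    TransactionId *xip;        /* full XIP array: xids in progress at
                                * snapshot creation */
    TransactionId  origin_xid; /* the exporting transaction itself, so its
                                * uncommitted writes stay visible */
    CommandId      curcid;     /* command counter within the origin xact */
} ExportedSnapshot;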

Additionally, some access control information needs to be 
transferred, to ensure parallel querying doesn't become a security hole. 
Joachim's patch currently circumvents this issue by requiring superuser 
privileges.
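As an illustration only (my assumption of what would need to travel with 
the snapshot, not something in Joachim's patch), the export could carry the 
origin's authorization state so that workers run with the same privileges 
instead of requiring superuser:

#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;

/*
 * Hypothetical authorization info shipped along with the snapshot, so a
 * worker backend can adopt the origin's privileges rather than demanding
 * superuser.  Field names are illustrative assumptions.
 */
typedef struct ExportedAuthInfo
{
    Oid  session_role;   /* role the origin session authenticated as */
    Oid  current_role;   /* role in effect after any SET ROLE */
    bool security_restricted;  /* is the origin inside a
                                * security-restricted operation? */
} ExportedAuthInfo;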

A worker backend for parallel querying should never need to write any 
data, so it should be forced into read-only mode. And I'd say the origin 
transaction should not be allowed to continue with another query before 
having "collected" all worker backends that attached to its snapshot. So 
we have yet another difference from Joachim's approach: workers continuing 
independently versus being bound to the origin transaction.
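A minimal, self-contained sketch of that lifecycle, with entirely 
hypothetical names (none of this exists in PostgreSQL): a worker is forced 
read-only when it attaches to the exported snapshot, and the origin may only 
run its next query once every attached worker has detached again:

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical origin-side bookkeeping for "collecting" worker backends. */
typedef struct SnapshotExportState
{
    int workers_attached;  /* workers that imported our snapshot */
    int workers_finished;  /* workers that have detached again */
} SnapshotExportState;

/* A worker attaches: it imports the snapshot and is forced read-only. */
static void
worker_attach(SnapshotExportState *state, bool *xact_read_only)
{
    *xact_read_only = true;    /* worker backends must never write */
    state->workers_attached++;
}

static void
worker_detach(SnapshotExportState *state)
{
    state->workers_finished++;
}

/* The origin may only continue once all attached workers are collected. */
static bool
origin_may_continue(const SnapshotExportState *state)
{
    return state->workers_finished == state->workers_attached;
}

int
main(void)
{
    SnapshotExportState state = {0, 0};
    bool worker_read_only = false;

    worker_attach(&state, &worker_read_only);
    printf("may continue: %d\n", origin_may_continue(&state));  /* prints 0 */
    worker_detach(&state);
    printf("may continue: %d\n", origin_may_continue(&state));  /* prints 1 */
    return 0;
}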

I realize this is not quite the same as what Joachim has in mind for 
parallel pg_dump. It seems to be a more general approach, which 
certainly also requires more work. However, I think it could fit the 
requirements of a parallel pg_dump as well.

Cluster hackers, is this a good summary that covers your needs as well? 
Is anything still missing?

Joachim, would you be willing to work on such a more general approach?

Regards

Markus Wanner

[1]: feature wish list of cluster hackers:
http://wiki.postgresql.org/wiki/ClusterFeatures

[2]: Synchronized Snapshots, by Joachim Wieland
http://archives.postgresql.org/message-id/dc7b844e1001081136k12ae4eq6d1f7689ed1adfe6@mail.gmail.com
