
Re: Transaction Snapshot Cloning

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-11 20:39:04
Message-ID:
Lists: pgsql-hackers
On Fri, 2008-01-11 at 15:05 -0500, Tom Lane wrote:

> the whole thing gives me the willies.

Me too :-)

> > What I'm thinking about is how we might use this to have multiple
> > sessions working simultaneously on tasks like unloading data,
> Then what you want is a function that says "clone the snapshot of that
> specified other transaction".  

That's exactly what I want.

I was thinking of a few use cases:

1. parallel query, where multiple backends work on parts of one query

2. parallel unload, where multiple backends work on different tables
that form part of the same set of tables to be unloaded

> Not a function that lets the user
> substitute random snapshot data and tell you he thinks it's valid.
> The user isn't going to have any legal way to transfer the data between
> backends anyway, since no transaction can see results of an uncommitted
> other transaction.  There *has* to be some backdoor channel involved
> there, and you might as well make it carry the data without the user
> touching it.
> The whole thing seems a bit backwards anyway.  What you'd really want
> for ease of use is some kind of "fork this session" operation, that
> is push the info to a new process not pull it.

For (1) I think a "fork this session" operation sounds right.

For (2) I definitely want to connect multiple times and yet have all
sessions see the same snapshot. Yes we want multiple backends, but we
also want multiple paths to the client.

For (2) there's a very simple way of transferring the data between
sessions:
a) we connect on session 1 as a serializable transaction
b) we ask session 1 for its snapshot
c) we then connect on session 2 as a serializable transaction
d) we then execute "select replace_serializable_snapshot(...)"

We already have everything in place to do a), b) and c).

So yes, it's a backdoor channel, via a single client with multiple
sessions and the xact datatype.
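The a)-d) steps above might look something like this in SQL. Note that
replace_serializable_snapshot() is only the function proposed in this
thread, not something that exists yet, and the snapshot literal and table
name are purely illustrative:

```sql
-- Session 1: open a serializable transaction and export its snapshot
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT txid_current_snapshot();
-- suppose this returns '1000:1005:1002,1003'

-- Session 2 (same client, second connection): open a serializable
-- transaction and adopt session 1's snapshot via the proposed function
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT replace_serializable_snapshot('1000:1005:1002,1003'::txid_snapshot);

-- Now COPY in session 2 sees exactly the data session 1 sees
COPY my_table TO STDOUT;
```

The client itself carries the snapshot text between the two sessions,
which is the backdoor channel described above.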

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > If we had a function 
> > 	replace_serializable_snapshot(master_xid, txid_snapshot)
> > this would allow us to use the txid_snapshot values to replace our
> > transaction's serializable snapshot.
> ... whereupon we'd get wrong answers.  Certainly you could not allow
> transaction xmin to go backwards, and I'm not sure what other
> restrictions there would be, but the whole thing gives me the willies.

So sure it gives me the willies, but I don't see a wrong answer there.
We're not looking for a general time-travel utility, I just want to
connect and run a COPY TO operation that sees the same data that another
session sees. Nothing fancy, so caveats can be as long as your arm as
long as we can run COPY TO on a naked table.
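For reference, the txid_snapshot values involved can already be pulled
apart with the existing txid functions; the snapshot literal below is
made up (xmin 10, xmax 20, in-progress xids 10,14,15):

```sql
-- Inspect the parts of a txid_snapshot ('xmin:xmax:xip-list')
SELECT txid_snapshot_xmin('10:20:10,14,15'::txid_snapshot);  -- 10
SELECT txid_snapshot_xmax('10:20:10,14,15'::txid_snapshot);  -- 20
SELECT txid_snapshot_xip('10:20:10,14,15'::txid_snapshot);   -- 10, 14, 15

-- xid 14 was in progress when this snapshot was taken, so it is not
-- visible under the snapshot:
SELECT txid_visible_in_snapshot(14, '10:20:10,14,15'::txid_snapshot);
```

Any xid below the snapshot's xmin is assumed committed and visible,
which is exactly why xmin must never go backwards when cloning: rows
that an older xmin should still see may already have been reclaimed.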

There are uses of that for parallel pg_dump, parallel Slony, etc.,
helping us upgrade faster to new releases.

  Simon Riggs
