Re: Transaction Snapshot Cloning

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-11 20:39:04
Message-ID: 1200083944.4266.1249.camel@ebony.site
Lists: pgsql-hackers

On Fri, 2008-01-11 at 15:05 -0500, Tom Lane wrote:

> the whole thing gives me the willies.

Me too :-)

> > What I'm thinking about is how we might use this to have multiple
> > sessions working simultaneously on tasks like unloading data,
>
> Then what you want is a function that says "clone the snapshot of that
> specified other transaction".

That's exactly what I want.

I was thinking of a few use cases:

1. parallel query, where multiple backends work on parts of one query

2. parallel unload, where multiple backends work on different tables
that form part of the same set of tables to be unloaded

> Not a function that lets the user
> substitute random snapshot data and tell you he thinks it's valid.
> The user isn't going to have any legal way to transfer the data between
> backends anyway, since no transaction can see results of an uncommitted
> other transaction. There *has* to be some backdoor channel involved
> there, and you might as well make it carry the data without the user
> touching it.
>
> The whole thing seems a bit backwards anyway. What you'd really want
> for ease of use is some kind of "fork this session" operation, that
> is push the info to a new process not pull it.

For (1) I think a "fork this session" operation sounds right.

For (2) I definitely want to connect multiple times and yet have all
sessions see the same snapshot. Yes we want multiple backends, but we
also want multiple paths to the client.

For (2) there's a very simple way of transferring the data between
sessions:
a) we connect on session 1 as a serializable transaction
b) we ask session 1 for its snapshot
c) we then connect on session 2 as a serializable transaction
d) we then execute "select replace_serializable_snapshot(...)" in session 2

We already have everything in place to do a), b) and c).

So yes, it's a backdoor channel, via a single client with multiple
sessions and the txid_snapshot datatype.
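
Concretely, a) to d) might look something like this -- the function in
step d) is of course just the proposal and doesn't exist yet, while
txid_current() and txid_current_snapshot() are the existing 8.3
functions, and :master_xid / :snap stand for the values the client
carries between its two connections:

  -- Session 1: the transaction whose view of the data we want to share
  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  SELECT txid_current()          AS master_xid,
         txid_current_snapshot() AS snap;   -- client saves both values

  -- Session 2: a second connection opened by the same client
  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  -- proposed function: adopt session 1's snapshot in place of the
  -- one taken at BEGIN
  SELECT replace_serializable_snapshot(:master_xid, ':snap'::txid_snapshot);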

> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > If we had a function
> > replace_serializable_snapshot(master_xid, txid_snapshot)
> > this would allow us to use the txid_snapshot values to replace our
> > transaction's serializable snapshot.
>
> ... whereupon we'd get wrong answers. Certainly you could not allow
> transaction xmin to go backwards, and I'm not sure what other
> restrictions there would be, but the whole thing gives me the willies.

Sure, it gives me the willies too, but I don't see a wrong answer there.
We're not looking for a general time-travel utility; I just want to
connect and run a COPY TO operation that sees the same data another
session sees. Nothing fancy, so the caveats can be as long as your arm,
as long as we can run COPY TO on a naked table.
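
Once both sessions have adopted the same snapshot, the client can split
the unload across them along these lines (table names and file paths
are just placeholders):

  -- Session 1: unload part of the table set
  COPY orders    TO '/backup/orders.dat';
  COPY customers TO '/backup/customers.dat';
  COMMIT;

  -- Session 2: unload the rest, seeing exactly the same data
  COPY line_items TO '/backup/line_items.dat';
  COPY products   TO '/backup/products.dat';
  COMMIT;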

There are uses for that in parallel pg_dump, parallel Slony etc.,
helping us upgrade faster to new releases.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
