Getting consistent snapshot in multiple backends, for parallel pg_dump

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Getting consistent snapshot in multiple backends, for parallel pg_dump
Date: 2009-11-07 10:36:41
Message-ID: 4AF54DB9.6040806@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Me & Simon got talking about the difficulty of doing parallel pg_dump,
where when you open multiple connections you must somehow ensure that
all the connections use the same snapshot so that you get a consistent
dump. We came up with a pretty simple way to do that:

1. Open N+1 the connections to the server
2. In one of them, grab ProcArrayLock in shared mode
3. In all other connections, begin a (serializable) transaction.
4. Release ProcArrayLock.

Because we're holding the ProcArrayLock across 2-4, all the connections
get at step 3 will get the same snapshot. That's exactly what we want
for doing a parallel pg_dump.

A difficulty with that is that we need some way to hold a lwlock until
the client tells to release it. You can't hold a lwlock over command
boundaries. But that's surely solvable, e.g by sleeping in the backend
with the lock held until signaled by another backend. With a timeout to
make sure we don't block indefinitely if the client crashes or something.

I'm not planning to do anything with this at the moment, but wanted to
get the idea out there and archived. It would be nice to see someone
implement parallel pg_dump similar to parallel pg_restore using this.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2009-11-07 11:05:41 Re: Getting consistent snapshot in multiple backends, for parallel pg_dump
Previous Message Pavel Stehule 2009-11-07 07:23:56 Re: Specific names for plpgsql variable-resolution control options?