Re: patch for parallel pg_dump

From: Joachim Wieland <joe(at)mcknight(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch for parallel pg_dump
Date: 2012-02-08 03:21:04
Message-ID: CACw0+134_wy9FbAbKobuHyMggd=tu=9kwtpF-0oeaUBc59w11Q@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 7, 2012 at 4:59 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> It turns out that (as you anticipated) there are some problems with my
> previous proposal.

I assume you're talking to Tom, as my powers of anticipation are
actually quite limited... :-)

> This is not
> quite enough to get rid of g_conn, but it's close: the major stumbling
> block at this point is probably exit_nicely().  The gyrations we're
> going through to make sure that AH->connection gets closed before
> exiting are fairly annoying; maybe we should invent something in
> dumputils.c along the line of the backend's on_shmem_exit().

Yeah, this becomes even more important with parallel jobs, where you
want all worker processes to die once the parent exits. Otherwise some 10
already-started processes would continue to dump your 10 largest
tables for the next few hours with the master process long dead... all
while you're about to start up the next master process...
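
To make the idea concrete, here is a rough sketch of what an exit-callback
registry in the spirit of on_shmem_exit() could look like. All names here
(register_exit_callback, run_exit_callbacks, record_exit_code) are
hypothetical and not taken from dumputils.c or from either patch; a real
exit_nicely() would presumably run the callbacks and then call exit(code).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EXIT_CALLBACKS 8

/* Hypothetical callback type: receives the exit code and a user argument. */
typedef void (*exit_callback) (int code, void *arg);

static struct
{
    exit_callback func;
    void       *arg;
} exit_callbacks[MAX_EXIT_CALLBACKS];

static int  n_exit_callbacks = 0;

/* Remember a cleanup action, e.g. closing a connection or killing workers. */
static void
register_exit_callback(exit_callback func, void *arg)
{
    assert(n_exit_callbacks < MAX_EXIT_CALLBACKS);
    exit_callbacks[n_exit_callbacks].func = func;
    exit_callbacks[n_exit_callbacks].arg = arg;
    n_exit_callbacks++;
}

/*
 * Run the registered callbacks in reverse order of registration and empty
 * the list. An exit_nicely(code) wrapper would call this, then exit(code).
 */
static void
run_exit_callbacks(int code)
{
    while (n_exit_callbacks > 0)
    {
        n_exit_callbacks--;
        exit_callbacks[n_exit_callbacks].func(code,
                                              exit_callbacks[n_exit_callbacks].arg);
    }
}

/* Illustrative callback: stores the exit code in the int that arg points to. */
static void
record_exit_code(int code, void *arg)
{
    *(int *) arg = code;
}
```

In the parallel case the master would register one callback that signals
the workers, so no caller has to remember to do that before exiting.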

In my patch I dealt with exactly the same problem for the error
handler by creating a separate function that has a static variable (a
pointer to the ParallelState). The value is set and retrieved through
the same function, so yes, it's kinda global but then again it can
only be accessed from this function, which is only called from the
error handler.
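
Roughly, the pattern looks like the sketch below. This is not the code
from the patch: the function names and the single field of ParallelState
are made up for illustration, and I am assuming a convention where a
non-NULL argument stores the pointer and NULL just reads it back.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the patch's ParallelState; the field is illustrative. */
typedef struct ParallelState
{
    int         numWorkers;
} ParallelState;

/*
 * Set-or-get accessor. A non-NULL argument replaces the stored pointer;
 * NULL leaves it unchanged. The current value is returned in both cases.
 */
static ParallelState *
error_handler_pstate(ParallelState *new_state)
{
    static ParallelState *saved = NULL; /* reachable only through this function */

    if (new_state != NULL)
        saved = new_state;
    return saved;
}

/*
 * What an error handler might do with it: find out how many workers need
 * to be shut down, or 0 if no parallel state has been registered yet.
 */
static int
workers_to_terminate(void)
{
    ParallelState *pstate = error_handler_pstate(NULL);

    return (pstate != NULL) ? pstate->numWorkers : 0;
}
```

One limitation of this convention is that the stored pointer cannot be
reset to NULL once it has been set; that would need a separate flag or
argument.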

> I'm starting to think it might make sense to press on with this
> refactoring just a bit further and eliminate the distinction between
> Archive and ArchiveHandle.

How about doing more refactoring after applying the patch? You'd then
see what is really needed, and we'd also have an actual use case
for more than one connection. (You might have already guessed that this
proposal is heavily influenced by my self-interest in avoiding too
much work to make my patch match your refactoring...)
