Re: patch for parallel pg_dump

From: Joachim Wieland <joe(at)mcknight(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch for parallel pg_dump
Date: 2012-03-14 04:34:10
Message-ID: CACw0+13MaDaJadSMMVdrp8PJVapqUjj-kyca-HXM0qN5G6gKtQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 13, 2012 at 1:53 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> What I mean is that the function ArchiveEntry() is defined in
> pg_backup_archiver.c, and it takes an argument called relpages, and
> the string "relpages" does not appear anywhere else in that file.

Uhm, that's kinda concerning, isn't it... fixed...

[...pgpipe...]
> Dunno.  Can we get an opinion on that from one of the Windows guys?
> Andrew, Magnus?

Waiting for the verdict here...

>> If a child terminates without leaving a message, the master will still
>> detect it and just say "a worker process died unexpectedly" (this part
>> was actually broken, but now it's fixed :-) )
>
> All that may be true, but I still don't see why it's right for this to
> apply in the cases where the worker thread says die_horribly(), but
> not in the cases where the worker says exit_horribly().

Hm, I'm not calling the error handler from exit_horribly because it
doesn't have the AH. It looks like the code assumes that
die_horribly() is called whenever AH is available and if not,
exit_horribly() should be called which eventually calls these
preregistered exit-hooks via exit_nicely() to clean up the connection.

I think we should somehow unify both functions, the code is not very
consistent in this respect, it also calls exit_horribly() when it has
AH available. See for example pg_backup_tar.c

Or is there another distinction between them? The question how to
clean it up basically brings us back to the question what to do about
global variables in general and for error handlers in particular.

>> Or we change fmtQualifiedId to take an int and then we always pass the
>> archive version instead of the Archive* ?
>
> Hmm, I think that might make sense.

Done.

>>> +enum escrow_action { GET, SET };
>>> +static void
>>> +parallel_error_handler_escrow_data(enum escrow_action act,
>>> ParallelState *pstate)
>>> +{
>>> +       static ParallelState *s_pstate = NULL;
>>> +
>>> +       if (act == SET)
>>> +               s_pstate = pstate;
>>> +       else
>>> +               *pstate = *s_pstate;
>>> +}
>>>
>>> This seems like a mighty complicated way to implement a global variable.
>>
>> Well, we talked about that before, when you complained that you
>> couldn't get rid of the global g_conn because of the exit handler.
>> You're right that in fact it is an indirect global variable here but
>> it's clearly limited to the use of the error handler and you can be
>> sure that nobody other than this function writes to it or accesses it
>> without calling this function.
>
> Sure, but since all the function does is write to it or access it,
> what good does that do me?

It encapsulates the variable so that it can only be used for one
specific use case.

Attaching a new version.

Attachment Content-Type Size
parallel_pg_dump_4.diff.gz application/x-gzip 33.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Vivek Singh Raghuwanshi 2012-03-14 06:08:19 Keystone auth in PostgreSQL
Previous Message Jeff Davis 2012-03-14 04:23:03 Re: initdb and fsync