Parallel pg_dump's error reporting doesn't work worth squat

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Parallel pg_dump's error reporting doesn't work worth squat
Date: 2015-12-23 18:16:55
Message-ID: 2458.1450894615@sss.pgh.pa.us
Lists: pgsql-hackers

I was in the process of testing the proposed patch for bug #13727,
and I found that, at least on my Linux box, this is the behavior
in the failure case without the patch:

$ pg_dump "postgres://postgres:phonypassword@localhost/regression" --jobs=9 -Fd -f testdump
$ echo $?
141
$ ls testdump
toc.dat

That is, the pg_dump process has crashed with a SIGPIPE without printing
any message whatsoever, and without coming anywhere near finishing the
dump.
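
For anyone wondering where 141 comes from: shells such as bash report a process killed by signal N as status 128 + N, and SIGPIPE is 13 on Linux. Here is a minimal Python sketch of that mapping, not pg_dump code; the child program and the helper names are made up for illustration, and it assumes a POSIX system.

```python
import signal
import subprocess
import sys

# Hypothetical child program: Python ignores SIGPIPE at startup, so restore
# the default disposition first, then write to a pipe with no reader left.
CHILD = r"""
import os, signal
signal.signal(signal.SIGPIPE, signal.SIG_DFL)
r, w = os.pipe()
os.close(r)                 # nobody will ever read from this pipe
os.write(w, b"command\n")   # the kernel delivers SIGPIPE here
"""


def run_child():
    """Run the child; subprocess reports death by signal N as returncode -N."""
    return subprocess.run([sys.executable, "-c", CHILD]).returncode


def shell_status(returncode):
    """Convert a subprocess returncode into the value bash would show in $?."""
    return 128 - returncode if returncode < 0 else returncode


rc = run_child()
print(rc, shell_status(rc))
```

The child dies on the write without printing anything, which matches the silent exit in the transcript above.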

A bit of investigation says that this is because somebody had the bright
idea that worker processes could report fatal errors back to the master
process instead of just printing them to stderr. So when the workers
fail to establish connections (because of the password problem cited in
#13727), they don't tell me about that. Oh no, they send those errors
back up the pipe to the parent, and then die silently. Meanwhile,
the parent is trying to send them commands, and since it doesn't protect
itself against SIGPIPE on the command pipes, it crashes without ever
reporting anything. If you aren't paying close attention, you wouldn't
even realize you didn't get a completed dump.
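
To illustrate what "protecting itself against SIGPIPE" could look like, here is a rough Python sketch under stated assumptions, not a proposal for the actual fix: with SIGPIPE ignored, a write to a pipe whose reader is gone fails with EPIPE instead of killing the writer, so the parent gets a chance to report something. The function name and the command string are hypothetical placeholders; in C the equivalent would be signal(SIGPIPE, SIG_IGN) plus a check of write()'s result.

```python
import os
import signal


def send_command(fd, command):
    """Write a command to a worker pipe; return an error string on EPIPE."""
    try:
        os.write(fd, command)
        return None
    except BrokenPipeError as exc:
        # With SIGPIPE ignored, the failed write surfaces as errno EPIPE.
        return f"could not write to worker: {os.strerror(exc.errno)}"


# Python already ignores SIGPIPE at startup; it is set explicitly here to
# mirror what a C program would have to do itself.
signal.signal(signal.SIGPIPE, signal.SIG_IGN)

read_end, write_end = os.pipe()
os.close(read_end)  # simulate a worker that has already exited

error = send_command(write_end, b"hypothetical command\n")
os.close(write_end)
print(error)
```

This only covers the parent's side: it turns the silent crash into a reportable error, but does nothing about the worker's own message being stuck in the other pipe.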

Depending on timing, this scheme might accidentally fail to fail, but it
seems fragile as can be. I would bet that it's prone to deadlocks, quite
aside from the SIGPIPE problem. Considering how amazingly ugly the
underlying code is (exit_horribly is in parallel.c now? Really?), I want
to rip it out entirely, not try to band-aid it by suppressing SIGPIPE ---
though likely we need to do that too.

Thoughts?

regards, tom lane
