Re: BUG #15449: file_fdw using program cause exit code error when using LIMIT

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: eric(dot)cyr(at)gmail(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15449: file_fdw using program cause exit code error when using LIMIT
Date: 2018-10-23 12:21:31
Message-ID: 18397.1540297291@sss.pgh.pa.us
Lists: pgsql-bugs pgsql-hackers

I wrote:
> PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
>> SELECT * FROM test_file_fdw_program_limit LIMIT 0;
>> /*
>> [38000] ERROR: program "echo "test"" failed Detail: child process exited
>> with exit code 1
>> */

> Yeah, I can reproduce this on macOS as well as Linux. Capturing stderr
> shows something pretty unsurprising:
> sh: line 1: echo: write error: Broken pipe
> So the called program is behaving in a somewhat reasonable way: it's
> detecting EPIPE on its stdout (after we close the pipe), reporting that,
> and doing exit(1).
> Unfortunately, it's not clear what we could do about that, short of
> always reading the whole program output, which is likely to be a
> cure worse than the disease. If the program were failing thanks to
> SIGPIPE, we could recognize that as a case we can ignore ... but with
> behavior like this, I don't see a reliable way to tell it apart from
> a generic program failure, which surely we'd better report.

After a bit of thought, the problem here is blindingly obvious:
we generally run the backend with SIGPIPE handling set to SIG_IGN,
and evidently popen() allows the called program to inherit that,
at least on these platforms.

So what we need to do is make sure the called program inherits SIG_DFL
handling for SIGPIPE, and then special-case that result as not being
a failure. The attached POC patch does that and successfully makes
the file_fdw problem go away for me.
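
As an illustration only (this is not the attached patch, and it uses
Python rather than the backend's C), the sketch below reproduces the
inheritance effect described above. The helper name
run_and_close_early and the shell command are hypothetical. It assumes
a POSIX system with sh on the PATH. The child is started with SIGPIPE
either ignored or at its default, and the parent closes the read end
of the pipe without reading anything.

```python
import signal
import subprocess


def run_and_close_early(ignore_sigpipe: bool) -> int:
    """Run a shell command that writes to a pipe nobody reads.

    Returns the Popen return code: a positive exit status, or the
    negated signal number if the child was killed by a signal.
    """
    def child_setup() -> None:
        # Runs in the child between fork and exec, so the launched
        # program inherits exactly this SIGPIPE disposition.
        disposition = signal.SIG_IGN if ignore_sigpipe else signal.SIG_DFL
        signal.signal(signal.SIGPIPE, disposition)

    proc = subprocess.Popen(
        # The sleep gives the parent time to close the pipe first.
        ["sh", "-c", "sleep 1; echo test"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        restore_signals=False,  # let child_setup alone decide
        preexec_fn=child_setup,
    )
    proc.stdout.close()  # stop reading early, as LIMIT 0 would
    return proc.wait()


if __name__ == "__main__":
    # Ignored SIGPIPE: echo sees EPIPE and the shell exits nonzero.
    print("SIG_IGN inherited:", run_and_close_early(True))
    # Default SIGPIPE: the shell is killed by the signal instead.
    print("SIG_DFL inherited:", run_and_close_early(False))
```

With SIG_IGN inherited, the caller sees only a nonzero exit status,
which looks like any other program failure. With SIG_DFL, the caller
sees death by SIGPIPE, which can be told apart from a generic failure.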

It's just a POC because there are some things that need more thought
than I've given them:

1. Is it OK to revert SIGPIPE to default processing for *all* programs
launched through OpenPipeStream? If not, what infrastructure do we
need to add to control that? In particular, is it sane to revert
SIGPIPE for a pipe program that we will write to, not read from?
(I thought probably yes, because that is the normal Unix behavior,
but it could be debated.)

2. Likewise, is it always OK for ClosePipeToProgram to ignore a
SIGPIPE failure? (For ordinary COPY, maybe it shouldn't, since
we don't intend to terminate that early.)

3. Maybe this should be implemented at some higher level?

4. Are there any other signals we ought to be reverting to default
behavior before launching a COPY TO/FROM PROGRAM?
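
One possible answer to question 2 can be sketched as follows. This is
a hypothetical policy, not what the POC patch does: the function name
should_report_failure and the reader_stopped_early flag are invented
for illustration, and the return code follows Python's subprocess
convention, in which death by signal N is reported as -N.

```python
import signal


def should_report_failure(returncode: int, reader_stopped_early: bool) -> bool:
    """Decide whether a pipe program's termination counts as an error.

    returncode uses the subprocess.Popen convention: 0 for success,
    a positive exit status, or -N if the program died from signal N.
    """
    if returncode == 0:
        return False  # clean exit is never a failure
    if returncode == -signal.SIGPIPE and reader_stopped_early:
        # We closed the pipe on purpose, so SIGPIPE is expected.
        return False
    # Any other exit status or signal, or a SIGPIPE we did not
    # provoke (e.g. an ordinary COPY read to the end), is reported.
    return True
```

Under this sketch, a SIGPIPE death is ignored only when the reader
deliberately stopped before end of input. An ordinary COPY that
intended to consume all output would still report it.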

regards, tom lane

Attachment Content-Type Size
process-sigpipe-normally-in-copy-from-program-1.patch text/x-patch 2.7 KB
