BufFileRead() error signalling

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: BufFileRead() error signalling
Date: 2019-11-20 21:50:54
Message-ID: CA+hUKGJE04G=8TLK0DLypT_27D9dR8F1RQgNp0jK6qR0tZGWOw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

As noted by Amit Khandhekar yesterday[1], BufFileLoad() silently eats
pread()'s error and makes them indistinguishable from EOF.

Some observations:

1. BufFileRead() returns size_t (not ssize_t), so it's an
fread()-style interface, not a read()-style interface that could use
-1 to signal errors. Unlike fread(), it doesn't seem to have anything
corresponding to feof()/ferror()/clearerr() that lets the caller
distinguish between EOF and error, so our hash and tuplestore spill
code simply trusts that if there is a 0 size read where it expects a
tuple boundary, it must be EOF.

2. BufFileWrite() is the same, but just like fwrite(), a short write
must always mean error, so there is no problem here.

3. The calling code assumes that unexpected short read or write sets
errno, which isn't the documented behaviour of fwrite() and fread(),
so there we're doing something a bit different (which is fine, just
pointing it out; we're sort of somewhere between the <stdio.h> and
<unistd.h> functions, in terms of error reporting).

I think the choices are: (1) switch to ssize_t and return -1, (2) add
at least one of BufFileEof(), BufFileError(), (3) have BufFileRead()
raise errors with elog(). I lean towards (2), and I doubt we need
BufFileClear() because the only thing we ever do in client code on
error is immediately burn the world down.

If we simply added an error flag to track if FileRead() had ever
signalled an error, we could change nodeHashjoin.c to do something
along these lines:

- if (nread == 0) /* end of file */
+ if (!BufFileError(file) && nread == 0) /* end of file */

... and likewise for tuplestore.c:

- if (nbytes != 0 || !eofOK)
+ if (BufFileError(file) || (nbytes == 0 && !eofOK))
ereport(ERROR,

About the only advantage to the current design I can see if that you
can probably make your query finish faster by pulling out your temp
tablespace USB stick at the right time. Or am I missing some
complication?

[1] https://www.postgresql.org/message-id/CAJ3gD9emnEys%3DR%2BT3OVt_87DuMpMfS4KvoRV6e%3DiSi5PT%2B9f3w%40mail.gmail.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2019-11-20 21:52:51 Re: BufFileRead() error signalling
Previous Message Peter Geoghegan 2019-11-20 21:43:05 Why is get_actual_variable_range()'s use of SnapshotNonVacuumable safe during recovery?