Re: BufFileRead() error signalling

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: BufFileRead() error signalling
Date: 2020-01-30 20:38:22
Message-ID: CAAKRu_Zd4PkFdG_a+1Yq0O3n_2ENfSR_6rQrsUZ26hYY4SJEQQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 27, 2020 at 7:09 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Rather than answering your actual question, I'd like to complain about
> this:
>
> if (BufFileRead(file, ptr, BLCKSZ) != BLCKSZ)
> - elog(ERROR, "could not read temporary file: %m");
> + elog(ERROR, "could not read temporary file");
>
> I recognize that your commit message explains this change by saying
> that this code will now never be reached except as a result of a short
> read, but I don't think that will necessarily be clear to future
> readers of those code, or people who get the error message. It seems
> like if we're going to do do this, the error messages ought to be
> changed to complain about a short read rather than an inability to
> read for unspecified reasons. However, I wonder why we don't make
> BufFileRead() throw all of the errors including complaining about
> short reads. I would like to confess my undying (and probably
> unrequited) love for the following code from md.c:
>
> errmsg("could not read block
> %u in file \"%s\": read only %d of %d bytes",
>
>
It would be cool to have the block number in more cases in error
messages. For example, in sts_parallel_scan_next()

/* Seek and load the chunk header. */
if (BufFileSeekBlock(accessor->read_file, read_page) != 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not read from shared tuplestore temporary
file"),
errdetail_internal("Could not seek to next block.")));

I'm actually in favor of having the block number in this error
message. I think it would be helpful for multi-batch parallel
hashjoin. If a worker reading one SharedTuplestoreChunk encounters an
error and another worker on a different SharedTuplstoreChunk of the
same file does not, I would find it useful to know more about which
block it was on when the error was encountered.

--
Melanie Plageman

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2020-01-30 20:45:36 Re: standby apply lag on inactive servers
Previous Message Tom Lane 2020-01-30 20:29:37 Re: Hash join not finding which collation to use for string hashing