Re: No error checking when reading from file using zstd in pg_dump

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Evgeniy Gorbanev <gorbanyoves(at)basealt(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: No error checking when reading from file using zstd in pg_dump
Date: 2025-06-16 15:41:10
Message-ID: D011594F-7FA3-4BE4-BF89-B1AB18CD3456@yesql.se
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 16 Jun 2025, at 16:20, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Daniel Gustafsson <daniel(at)yesql(dot)se> writes:
>>> On 16 Jun 2025, at 15:56, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> I've not checked to see what the other users of this API do, but
>>> if they're all like this then we need to fix that comment.
>
>> AFAICT all other callers of this API are throwing an error with pg_fatal, and
>> so does the function in question for ZStd decompression errors.
>
> I think I agree that we need to handle the ferror() case inside the
> read_func for consistency. But there is another problem, which is
> that neither of its callers are paying attention to the API: as
> defined, the read_func can never return anything but "true",
> which is how Zstd_read behaves. But both the callers in
> compress_zstd.c seem to think they should test its result to
> detect EOF.

Attached is a stab at fixing both the error handling in read_func as well as
the callers misuse of the API.

> Apparently, our tests do not cover the case of an
> unexpected EOF.

I admittedly ran out of steam when looking at adding something like this to our
pg_dump tests.

> This API is actually quite bizarrely defined, if you ask me.
> Returning the byte count is optional, but if you don't pay
> attention to the byte count you cannot know if you got any
> data. At best, that's bug-encouraging.

Agreed. Given that many of implementations in the gzip support code directly
return the gzXXX function I suspect it was modeled around GZip but thats an
unresearched guess. The fact that this returns Z_NULL where the API defines
NULL is unlikely to ever be an issue, but it also doesn't seem entirely neat..

--
Daniel Gustafsson

Attachment Content-Type Size
0001-pg_dump-Handle-errors-in-reading-ZStd-streams.patch application/octet-stream 2.7 KB
unknown_filename text/plain 1 byte

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Christoph Berg 2025-06-16 15:43:54 Re: CHECKPOINT unlogged data
Previous Message Nathan Bossart 2025-06-16 15:18:41 Re: CHECKPOINT unlogged data