Re: Add LZ4 compression in pg_dump

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: gkokolatos(at)pm(dot)me, shiy(dot)fnst(at)fujitsu(dot)com, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Rachel Heaton <rachelmheaton(at)gmail(dot)com>
Subject: Re: Add LZ4 compression in pg_dump
Date: 2023-03-16 22:30:50
Message-ID: fc130e92-09ea-c5b0-a1a0-29f93c323371@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/16/23 01:20, Justin Pryzby wrote:
> On Mon, Mar 13, 2023 at 10:47:12PM +0100, Tomas Vondra wrote:
>>> Rearrange functions to their original order allowing a cleaner diff to the prior code;
>>
>> OK. I wasn't very enthusiastic about this initially, but after thinking
>> about it a bit I think it's meaningful to make diffs clearer. But I
>> don't see much difference with/without the patch. The
>>
>> git diff --diff-algorithm=minimal -w e9960732a~:src/bin/pg_dump/compress_io.c src/bin/pg_dump/compress_gzip.c
>>
>> Produces ~25k diff with/without the patch. What am I doing wrong?
>
> Do you mean 25 kB of diff ?

Yes, if you redirect the git-diff to a file, it's a 25kB file.

> I agree that the statistics of the diff output don't change a lot:
>
> 1 file changed, 201 insertions(+), 570 deletions(-)
> 1 file changed, 198 insertions(+), 548 deletions(-)
>
> But try reading the diff while looking for the cause of a bug. It's the
> difference between reading 50, two-line changes, and reading a hunk that
> replaces 100 lines with a different 100 lines, with empty/unrelated
> lines randomly thrown in as context.
>
> When the diff is readable, the pg_fatal() also stands out.
>

I don't know, maybe I'm doing something wrong or maybe I just am bad at
looking at diffs, but if I apply the patch you submitted on 8/3 and do
the git-diff above (output attached), it seems pretty incomprehensible
to me :-( I don't see 50 two-line changes (I certainly wouldn't be able
to identify the root cause of the bug based on that).

>>> Change pg_fatal() to an assertion+comment;
>>
>> Yeah, that's reasonable. I'd even ditch the assert/comment, TBH. We
>> could add such protections against "impossible" stuff to a zillion other
>> places and the confusion likely outweighs the benefits.
>>
>>> Update the commit message and fix a few typos;
>>
>> Thanks. I don't want to annoy you too much, but could you split the
>> patch into the "empty-data" fix and all the other changes (rearranging
>> functions etc.)? I'd rather not mix those in the same commit.
>
> I don't know if that makes sense? The "empty-data" fix creates a new
> function called DeflateCompressorInit(). My proposal was to add the new
> function in the same place in the file as it used to be.
>

Got it. In that case I agree it's fine to do that in a single commit.

> The patch also moves the pg_fatal() that's being removed. I don't think
> it's going to look any cleaner to read a history involving the
> pg_fatal() first being added, then moved, then removed. Anyway, I'll
> wait while the community continues discussion about the pg_fatal().
>

I think the agreement was to replace the pg_fatal with and assert, and I
see your patch already does that.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
git-diff.txt text/plain 24.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2023-03-16 22:58:42 Re: Add LZ4 compression in pg_dump
Previous Message Tom Lane 2023-03-16 22:15:51 Re: A problem about ParamPathInfo for an AppendPath