Re: Add LZ4 compression in pg_dump

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: gkokolatos(at)pm(dot)me, shiy(dot)fnst(at)fujitsu(dot)com, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Rachel Heaton <rachelmheaton(at)gmail(dot)com>
Subject: Re: Add LZ4 compression in pg_dump
Date: 2023-03-16 00:20:59
Message-ID: ZBJg6ylg17Yrj4Bm@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 13, 2023 at 10:47:12PM +0100, Tomas Vondra wrote:
> > Rearrange functions to their original order allowing a cleaner diff to the prior code;
>
> OK. I wasn't very enthusiastic about this initially, but after thinking
> about it a bit I think it's meaningful to make diffs clearer. But I
> don't see much difference with/without the patch. The
>
> git diff --diff-algorithm=minimal -w e9960732a~:src/bin/pg_dump/compress_io.c src/bin/pg_dump/compress_gzip.c
>
> Produces ~25k diff with/without the patch. What am I doing wrong?

Do you mean 25 kB of diff ? I agree that the statistics of the diff
output don't change a lot:

1 file changed, 201 insertions(+), 570 deletions(-)
1 file changed, 198 insertions(+), 548 deletions(-)

But try reading the diff while looking for the cause of a bug. It's the
difference between reading 50, two-line changes, and reading a hunk that
replaces 100 lines with a different 100 lines, with empty/unrelated
lines randomly thrown in as context.

When the diff is readable, the pg_fatal() also stands out.

> > Change pg_fatal() to an assertion+comment;
>
> Yeah, that's reasonable. I'd even ditch the assert/comment, TBH. We
> could add such protections against "impossible" stuff to a zillion other
> places and the confusion likely outweighs the benefits.
>
> > Update the commit message and fix a few typos;
>
> Thanks. I don't want to annoy you too much, but could you split the
> patch into the "empty-data" fix and all the other changes (rearranging
> functions etc.)? I'd rather not mix those in the same commit.

I don't know if that makes sense? The "empty-data" fix creates a new
function called DeflateCompressorInit(). My proposal was to add the new
function in the same place in the file as it used to be.

The patch also moves the pg_fatal() that's being removed. I don't think
it's going to look any cleaner to read a history involving the
pg_fatal() first being added, then moved, then removed. Anyway, I'll
wait while the community continues discussion about the pg_fatal().

--
Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2023-03-16 00:29:22 Re: Allow logical replication to copy tables in binary format
Previous Message Justin Pryzby 2023-03-16 00:14:24 Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode