Re: Add LZ4 compression in pg_dump

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: gkokolatos(at)pm(dot)me
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, shiy(dot)fnst(at)fujitsu(dot)com, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Rachel Heaton <rachelmheaton(at)gmail(dot)com>
Subject: Re: Add LZ4 compression in pg_dump
Date: 2023-03-01 17:08:17
Message-ID: 20230301170817.GA4268@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 01, 2023 at 01:39:14PM +0000, gkokolatos(at)pm(dot)me wrote:
> On Wednesday, March 1st, 2023 at 12:58 AM, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>
> > The current function order avoids 3 lines of declarations, but it's
> > obviously pretty useful to be able to run that diff command. I already
> > argued for not calling the functions "Gzip" on the grounds that the name
> > was inaccurate.
>
> I have no idea why we are back on the naming issue. I stand by the name
> because in my humble opinion helps the code reader. There is a certain
> uniformity when the compression_spec.algorithm and the compressor
> functions match as the following code sample shows.

I mentioned that it's because this allows usefully running "diff"
against the previous commits.

> if (compression_spec.algorithm == PG_COMPRESSION_NONE)
> InitCompressorNone(cs, compression_spec);
> else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
> InitCompressorGzip(cs, compression_spec);
> else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
> InitCompressorLZ4(cs, compression_spec);
>
> When the reader wants to see what happens when the PG_COMPRESSION_XXX
> is set, has to simply search for the XXX part. I think that this is
> justification enough for the use of the names.

You're right about that.

But (with the exception of InitCompressorGzip), I'm referring to the
naming of internal functions, static to gzip.c, so renaming can't be
said to cause a loss of clarity.

> > I'd want to create an empty large object in src/test/sql/largeobject.sql
> > to exercise this tested during pgupgrade. But unfortunately that
> > doesn't use -Fc, so this isn't hit. Empty input is an important enough
> > test case to justify a tap test, if there's no better way.
>
> Please find in the attached a test case that exercises this codepath.

Thanks for writing it.

This patch could be an opportunity to improve the "diff" output, without
renaming anything.

The old order of functions was:
-InitCompressorZlib
-EndCompressorZlib
-DeflateCompressorZlib
-WriteDataToArchiveZlib
-ReadDataFromArchiveZlib

If you put DeflateCompressorEnd immediately after DeflateCompressorInit,
diff works nicely. You'll have to add at least one declaration, which
seems very worth it.

--
Justin

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-03-01 17:25:03 Re: refactoring relation extension and BufferAlloc(), faster COPY
Previous Message Andres Freund 2023-03-01 17:02:00 Re: refactoring relation extension and BufferAlloc(), faster COPY