Re: Add LZ4 compression in pg_dump

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: gkokolatos(at)pm(dot)me, Michael Paquier <michael(at)paquier(dot)xyz>, shiy(dot)fnst(at)fujitsu(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, Rachel Heaton <rachelmheaton(at)gmail(dot)com>
Subject: Re: Add LZ4 compression in pg_dump
Date: 2023-03-10 13:05:49
Message-ID: ZAsrLTWyz5w0KCRl@telsasoft.com
Lists: pgsql-hackers

On Thu, Mar 09, 2023 at 06:58:20PM +0100, Tomas Vondra wrote:
> I'm a bit confused about the lz4 vs. lz4f stuff, TBH. If we switch to
> lz4f, doesn't that mean it (e.g. restore) won't work on systems that
> only have older lz4 version? What would/should happen if we take backup
> compressed with lz4f, and then try restoring it on an older system where
> lz4 does not support lz4f?

You seem to be thinking about LZ4F as a weird, new innovation I'm
experimenting with, but compress_lz4.c already uses LZ4F for its "file"
API. LZ4F is also what's written by the lz4 CLI tool, and I found that
LZ4F has been included in the library for ~8 years:

https://github.com/lz4/lz4/releases?page=2
r126 Dec 24, 2014
New : lz4frame API is now integrated into liblz4
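
For anyone wanting to check which format a given file uses, here is a
minimal sketch (Python, standard library only) that tests for the LZ4
frame magic number, 0x184D2204 stored little-endian, as documented in
the LZ4 frame format description. The function name is made up for
illustration; nothing like it exists in pg_dump.

```python
import struct

# Magic number that starts every LZ4 frame (LZ4F), per the frame format spec.
LZ4F_MAGIC = 0x184D2204


def looks_like_lz4_frame(data: bytes) -> bool:
    """Return True if data begins with the LZ4 frame magic number.

    Hypothetical helper for illustration only. Raw LZ4 blocks carry no
    magic number, so a False result does not identify the data as a raw block.
    """
    if len(data) < 4:
        return False
    # The magic number is stored as a 4-byte little-endian integer.
    (magic,) = struct.unpack_from("<I", data, 0)
    return magic == LZ4F_MAGIC
```

Reading the first four bytes of a file produced by the lz4 CLI and
passing them to looks_like_lz4_frame() should return True, assuming
the CLI wrote the frame format rather than its legacy format.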

> Maybe if lz4f format is incompatible with regular lz4, we should treat
> it as a separate compression method 'lz4f'?
>
> I'm mostly afk until the end of the week, but I tried searching for lz4f
> info - the results are not particularly enlightening, unfortunately.
>
> AFAICS this only applies to lz4f stuff. Or would the streaming mode be a
> breaking change too?

Streaming mode outputs the same format as the existing code, but gives
better compression. We could (theoretically) change it in a bugfix
release, and old output would still be restorable (I think new output
would even be restorable with the old versions of pg_restore).
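
To illustrate why sharing history across chunks compresses better, here
is a rough analogy using Python's zlib module. This is zlib, not LZ4,
and it says nothing about format compatibility; it only compares
compressing every row independently against feeding the rows through
one compressor that remembers earlier input. Both function names are
made up for the sketch.

```python
import zlib


def independent_size(rows):
    """Total output size when every row is compressed on its own."""
    return sum(len(zlib.compress(row)) for row in rows)


def streaming_size(rows):
    """Total output size when one compressor keeps history across rows.

    Z_SYNC_FLUSH emits the pending output after each row without
    discarding the history, so later rows can reference earlier ones.
    """
    comp = zlib.compressobj()
    total = 0
    for row in rows:
        total += len(comp.compress(row))
        total += len(comp.flush(zlib.Z_SYNC_FLUSH))
    total += len(comp.flush())  # finish the stream
    return total
```

With a few hundred short, similar rows, streaming_size() should come
out well below independent_size(), since each independent call pays
for its own header and cannot reuse earlier rows.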

But that's not true for LZ4F. The benefit there is that it avoids
outputting a separate block for each row. That's essential for narrow
tables, for which the block header currently being written has an
overhead several times larger than the data.
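
A back-of-the-envelope sketch of that overhead, in Python. The header
size and row sizes are placeholder parameters, not measurements of
what pg_dump writes, and compression of the payload itself is ignored;
the point is only how a fixed per-block cost scales with the number of
blocks.

```python
def bytes_written(row_sizes, header_bytes, rows_per_block):
    """Total bytes when rows are grouped into blocks with a fixed header each.

    row_sizes      -- payload size of each row, in bytes
    header_bytes   -- assumed fixed cost per block (hypothetical value)
    rows_per_block -- how many rows share one block
    """
    if rows_per_block < 1:
        raise ValueError("rows_per_block must be at least 1")
    total = 0
    for start in range(0, len(row_sizes), rows_per_block):
        block = row_sizes[start:start + rows_per_block]
        total += header_bytes + sum(block)
    return total
```

For 1000 rows of 4 bytes each and an assumed 16-byte header, one block
per row gives bytes_written([4] * 1000, 16, 1) == 20000, while a single
block gives bytes_written([4] * 1000, 16, 1000) == 4016. In the
per-row case the headers alone are four times the size of the data.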

--
Justin
