Re: refactoring basebackup.c

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>
Cc: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>
Subject: Re: refactoring basebackup.c
Date: 2021-11-02 14:32:35
Message-ID: CA+TgmoZvqk7UuzxsX1xjJRmMGkqoUGYTZLDCH8SmU1xTPr1Xig@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 2, 2021 at 7:53 AM Jeevan Ladhe
<jeevan(dot)ladhe(at)enterprisedb(dot)com> wrote:
> I have implemented the cleanup callback bbsink_lz4_cleanup() in the attached patch.
>
> Please have a look and let me know of any comments.

Looks pretty good. I think you should work on stuff like documentation
and tests, and I need to do some work on that stuff, too. Also, I
think you should try to figure out how to support different
compression levels. For gzip, I did that by making gzip1..gzip9
possible compression settings. But that might not have been the right
idea because something like lz43 to mean lz4 at level 3 would be
confusing. Also, for the lz4 command line utility, there's not only
"lz4 -3" which means LZ4 with level 3 compression, but also "lz4
--fast=3" which selects "ultra-fast compression level 3" rather than
regular old level 3. And apparently LZ4 levels go up to 12 rather than
just 9 like gzip. I'm thinking maybe we should go with something like
"gzip(at)9" rather than just "gzip9" to mean gzip with compression level
9, and then things like "lz4(at)3" or "lz4(at)fast3" would select either the
regular compression levels or the ultra-fast compression levels.

Meanwhile, I think it's probably OK for me to go ahead and commit
0001-0003 from my patches at this point, since it seems we have pretty
good evidence that the abstraction basically works, and there doesn't
seem to be any value in holding off and maybe having to do a bunch
more rebasing. We may also want to look into making -Fp work with
--server-compression, which would require pg_basebackup to know how to
decompress. I'm actually not sure if this is worthwhile; you'd need to
have a network connection slow enough that it's worth spending a lot
of CPU time compressing on the server and decompressing on the client
to make up for the cost of network transfer. But some people might
have that case. It might make it easier to test this, too, since we
probably can't rely on having an LZ4 binary installed. Another thing
that you probably need to investigate is also supporting client-side
LZ4 compression. I think that is probably a really desirable addition
to your patch set, since people might find it odd if that were
exclusively a server-side option. Hopefully it's not that much work.

One minor nitpick in terms of the code:

+ mysink->bytes_written = mysink->bytes_written + headerSize;

I would use += here.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2021-11-02 14:35:18 Re: Improve logging when using Huge Pages
Previous Message Tom Lane 2021-11-02 14:21:48 Re: inefficient loop in StandbyReleaseLockList()