Re: refactoring basebackup.c

From: Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>
To: Jeevan Ladhe <jeevanladhe(dot)os(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Shinoda, Noriyoshi (PN Japan FSIP)" <noriyoshi(dot)shinoda(at)hpe(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Abhijit Menon-Sen <ams(at)toroid(dot)org>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>
Subject: Re: refactoring basebackup.c
Date: 2022-03-04 08:31:50
Message-ID: CAN1g5_H+8ptP9NLzAn1Qx1tu=0Y6Ohp_v5xgYJbpWBC7CwCL8Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

> > It will be good if we can also fix
> > CreateWalTarMethod to support LZ4 and ZSTD.
> Ok we will see, either Dipesh or I will take care of it.

I took a look at the CreateWalTarMethod to support LZ4 compression
for WAL files. The current implementation involves a 3 step to backup
a WAL file to a tar archive. For each file:

1. It first writes the header in the function tar_open_for_write,
flushes the contents of tar to disk and stores the header offset.
2. Next, the contents of WAL are written to the tar archive.
3. In the end, it recalculates the checksum in function tar_close() and
overwrites the header at an offset stored in step #1.

The need for overwriting header in CreateWalTarMethod is mainly related to
partial WAL files where the size of the WAL file < WalSegSize. The file is
being
padded and checksum is recalculated after adding pad bytes.

If we go ahead and implement LZ4 support for CreateWalTarMethod then
we have a problem here at step #3. In order to achieve better compression
ratio, compressed LZ4 blocks are linked to each other and these blocks
are decoded sequentially. If we overwrite the header as part of step #3 then
it corrupts the link between compressed LZ4 blocks. Although LZ4 provides
an option to write the compressed block independently (using blockMode
option set to LZ4F_blockIndepedent) but it is still a problem because we
don't
know if overwriting the header after recalculating the checksum will not
overlap
the boundary of the next block.

GZIP manages to overcome this problem as it provides an option to turn
on/off
compression on the fly while writing a compressed archive with the help of
zlib
library function deflateParams(). The current gzip implementation for
CreateWalTarMethod uses this library function to turn off compression just
before
step #1 and it writes the uncompressed header of size equal to
TAR_BLOCK_SIZE.
It uses the same library function to turn on the compression for writing
the contents
of the WAL file as part of step #2. It again turns off the compression just
before step
#3 to overwrite the header. The header is overwritten at the same offset
with size
equal to TAR_BLOCK_SIZE.

Since GZIP provides this option to enable/disable compression, it is
possible to
control the size of data we are writing to a compressed archive. Even if we
overwrite
an already written block in a compressed archive there is no risk of it
overlapping
with the boundary of the next block. This mechanism is not available in LZ4
and ZSTD.

In order to support LZ4 and ZSTD compression for CreateWalTarMethod we may
need to refactor this code unless I am missing something. We need to
somehow
add the padding bytes in case of partial WAL before we send it to the
compressed
archive. This will make sure that all files which are being compressed does
not
require any padding as the size is always equal to WalSegSize. There is no
need to
recalculate the checksum and we can avoid overwriting the header as part of
step #3.

Thoughts?

Thanks,
Dipesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2022-03-04 08:50:20 Re: psql - add SHOW_ALL_RESULTS option
Previous Message Kyotaro Horiguchi 2022-03-04 08:28:45 Re: pg_tablespace_location() failure with allow_in_place_tablespaces