Re: refactoring basebackup.c

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Jeevan Ladhe <jeevanladhe(dot)os(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Shinoda, Noriyoshi (PN Japan FSIP)" <noriyoshi(dot)shinoda(at)hpe(dot)com>, Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>, Abhijit Menon-Sen <ams(at)toroid(dot)org>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: refactoring basebackup.c
Date: 2022-03-11 15:19:29
Message-ID: CA+Tgmob6Rnjz-Qv32h3yJn8nnUkLhrtQDAS4y5AtsgtorAFHRA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Mar 10, 2022 at 8:02 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> I'm getting errors from pg_basebackup when using both -D- and --compress=server-*
> The issue seems to go away if I use --no-manifest.
>
> $ ./src/bin/pg_basebackup/pg_basebackup -h /tmp -Ft -D- --wal-method none --compress=server-gzip >/dev/null ; echo $?
> pg_basebackup: error: tar member has empty name
> 1
>
> $ ./src/bin/pg_basebackup/pg_basebackup -h /tmp -Ft -D- --wal-method none --compress=server-gzip >/dev/null ; echo $?
> NOTICE: WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup
> pg_basebackup: error: COPY stream ended before last file was finished
> 1

Thanks for the report. The problem here is that, when the output is
standard output (-D -), pg_basebackup can only produce a single output
file, so the manifest gets injected into the tar file on the client
side rather than being written separately as we do in normal cases.
However, that only works if we're receiving a tar file that we can
parse from the server, and here the server is sending a compressed
tar file. The current code mistakenly attempts to parse the compressed
tar file as if it were an uncompressed one, which causes the error
messages that you are seeing (and which I can also reproduce here). We
actually have enough infrastructure available in pg_basebackup now
that we could do the "right thing" in this case: decompress the data
received from the server, parse the resulting tar file, inject the
backup manifest, construct a new tar file, and recompress. However, I
think that's probably not a good idea, because it's unlikely that the
user will understand that the data is being compressed on the server,
then decompressed and recompressed on the client, and the performance of
the resulting pipeline will probably not be very good. So I think we
should just refuse this command. Patch for that attached.
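
To make the failure mode concrete, here is a rough standalone sketch in
Python. It is not the attached patch and not pg_basebackup's code; every
function name in it is hypothetical, and the member name passed to the
pipeline is just an example. It shows two things: a gzip stream does not
pass even a basic tar header checksum test, and the "right thing"
pipeline described above would mean decompressing, rebuilding the tar
with one extra member, and compressing again.

```python
import gzip
import io
import tarfile

BLOCK = 512  # tar headers and file data are laid out in 512-byte blocks


def make_tar(members):
    """Build an uncompressed ustar archive in memory from {name: bytes}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
        for name, data in members.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def header_looks_valid(block):
    """Verify the checksum field (bytes 148..155) of one tar header block."""
    if len(block) < BLOCK:
        return False
    try:
        stored = int(block[148:156].strip(b"\0 ") or b"0", 8)
    except ValueError:
        # Not octal digits at all, as is typical for compressed data.
        return False
    # The checksum is computed with the checksum field itself read as spaces.
    computed = sum(block[:148]) + 8 * ord(" ") + sum(block[156:BLOCK])
    return stored == computed


def inject_member(tar_bytes, name, data):
    """Return a new tar with every member of tar_bytes plus one extra file."""
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as src, \
         tarfile.open(fileobj=out, mode="w", format=tarfile.USTAR_FORMAT) as dst:
        for member in src:
            body = src.extractfile(member) if member.isfile() else None
            dst.addfile(member, body)
        info = tarfile.TarInfo(name)
        info.size = len(data)
        dst.addfile(info, io.BytesIO(data))
    return out.getvalue()


def reinject_compressed(gz_bytes, name, data):
    """Hypothetical full pipeline: decompress, rebuild the tar, recompress."""
    plain = gzip.decompress(gz_bytes)            # undo the server's compression
    patched = inject_member(plain, name, data)   # parse, copy, add the new member
    return gzip.compress(patched)                # compress a second time
```

Calling header_looks_valid() on the first 512 bytes of an archive from
make_tar() succeeds, while the same call on the gzip.compress() output of
that archive fails, since the stream begins with the gzip magic bytes
rather than a tar header. reinject_compressed() works, but each byte is
compressed twice and decompressed once, which is the cost argued against
above.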

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachment: reject-compressed-inject.patch (application/octet-stream, 2.0 KB)
