Re: refactoring basebackup.c (zstd workers)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Jeevan Ladhe <jeevanladhe(dot)os(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Shinoda, Noriyoshi (PN Japan FSIP)" <noriyoshi(dot)shinoda(at)hpe(dot)com>, Abhijit Menon-Sen <ams(at)toroid(dot)org>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>
Subject: Re: refactoring basebackup.c (zstd workers)
Date: 2022-03-17 20:29:51
Message-ID: CA+TgmoZoFEEbneFumEdq1X6ye_=L-a0G3-bKG36yG18oY9qpcw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the review!

I'll address most of these comments later, but quickly for right now...

On Thu, Mar 17, 2022 at 3:41 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> It'd be great if this were re-usable for wal_compression, which I hope in pg16 will
> support at least level=N. And eventually pg_dump. But those clients shouldn't
> accept a client/server prefix. Maybe the way to handle that is for those tools
> to check locationres and reject it if it was specified.
> [...]
> This is confusingly similar to src/include/access/xlog.h:WalCompression.
> I think someone else mentioned this before ?

A couple of people before me have had delusions of grandeur in this
area. We have the WalCompression enum, which has values of the form
COMPRESSION_*, instead of WAL_COMPRESSION_*, as if the WAL were going
to be the only thing that ever got compressed. And pg_dump.h also has
a CompressionAlgorithm enum, with values like COMPR_ALG_*, which isn't
great naming either. Clearly there's some cleanup needed here: if we
can use the same enum for multiple systems, then it can have a name
implying that it's the only game in town, but otherwise both the enum
name and the corresponding value need to use a suitable prefix. I
think that's a job for another patch, probably post-v15. For now I
plan to do the right thing with the new names I'm adding, and leave
the existing names alone. That can be changed in the future, if and
when it seems sensible.

As I said elsewhere, I think the WAL compression stuff is badly
designed and should probably be rewritten completely, maybe to reuse
the bbstreamer stuff. In that case, WalCompressionMethod would
probably go away entirely, making the naming confusion moot, and
picking up zstd and lz4 compression support for free. If that doesn't
happen, we can probably find some way to at least make them share an
enum, but I think that's too hairy to try to clean up right now with
feature freeze pending.

> The server crashes if I send an unknown option - you should hit that in the
> regression tests.
>
> $ src/bin/pg_basebackup/pg_basebackup --wal-method fetch -Ft -D - -h /tmp --no-sync --no-manifest --compress=server-lz4:a |wc -c
> TRAP: FailedAssertion("pointer != NULL", File: "../../../../src/include/utils/memutils.h", Line: 123, PID: 8627)
> postgres: walsender pryzbyj [local] BASE_BACKUP(ExceptionalCondition+0xa0)[0x560b45d7b64b]
> postgres: walsender pryzbyj [local] BASE_BACKUP(pfree+0x5d)[0x560b45dad1ea]
> postgres: walsender pryzbyj [local] BASE_BACKUP(parse_bc_specification+0x154)[0x560b45dc5d4f]
> postgres: walsender pryzbyj [local] BASE_BACKUP(+0x43d56c)[0x560b45bc556c]
> postgres: walsender pryzbyj [local] BASE_BACKUP(SendBaseBackup+0x2d)[0x560b45bc85ca]
> postgres: walsender pryzbyj [local] BASE_BACKUP(exec_replication_command+0x3a2)[0x560b45bdddb2]
> postgres: walsender pryzbyj [local] BASE_BACKUP(PostgresMain+0x6b2)[0x560b45c39131]
> postgres: walsender pryzbyj [local] BASE_BACKUP(+0x40530e)[0x560b45b8d30e]
> postgres: walsender pryzbyj [local] BASE_BACKUP(+0x408572)[0x560b45b90572]
> postgres: walsender pryzbyj [local] BASE_BACKUP(+0x4087b9)[0x560b45b907b9]
> postgres: walsender pryzbyj [local] BASE_BACKUP(PostmasterMain+0x1135)[0x560b45b91d9b]
> postgres: walsender pryzbyj [local] BASE_BACKUP(main+0x229)[0x560b45ad0f78]

That's odd - I thought I had tested that case. Will double-check.

> This is interpreted like client-gzip-1; should multiple specifications of
> compress be prohibited ?
>
> | src/bin/pg_basebackup/pg_basebackup --wal-method fetch -Ft -D - -h /tmp --no-sync --no-manifest --compress=server-lz4 --compress=1

They're not now and haven't been in the past. I think the last one
should just win (as it apparently does, here). We do that in some
places and throw an error in others and I'm not sure if we have a 100%
consistent rule for it, but flipping one location between one behavior
and the other isn't going to make things more consistent overall.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2022-03-17 20:33:41 Re: proposal: enhancing plpgsql debug API - returns text value of variable content
Previous Message Stephen Frost 2022-03-17 20:14:29 Re: Proposal: Support custom authentication methods using hooks