Re: refactoring basebackup.c

From: Jeevan Ladhe <jeevanladhe(dot)os(at)gmail(dot)com>
To: Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Shinoda, Noriyoshi (PN Japan FSIP)" <noriyoshi(dot)shinoda(at)hpe(dot)com>, Abhijit Menon-Sen <ams(at)toroid(dot)org>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>
Subject: Re: refactoring basebackup.c
Date: 2022-03-15 10:33:05
Message-ID: CANm22Ch_hU-86CCC-HkUAOY+6jYfGp5jJkde97fTu3u7u=PW9g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the patch, Dipesh.
I had a look at the patch and also tried to take the backup. I have
following suggestions and observations:

I get following error at my end:

$ pg_basebackup -D /tmp/zstd_bk -Ft -Xfetch --compress=server-zstd:7(at)4
pg_basebackup: error: could not initiate base backup: ERROR: could not
compress data: Unsupported parameter
pg_basebackup: removing data directory "/tmp/zstd_bk"

This is mostly because I have the zstd library version v1.4.4, which
does not have default support for parallel workers. Maybe we should
have a better error, something that is hinting that the parallelism is
not supported by the particular build.

The regression for pg_verifybackup test 008_untar.pl also fails with a
similar error. Here, I think we should have some logic in regression to
skip the test if the parameter is not supported?

+ if (ZSTD_isError(ret))

+ elog(ERROR,

+ "could not compress data: %s",

+ ZSTD_getErrorName(ret));

I think all of this can go on one line, but anyhow we have to improve
the error message here.

Also, just a thought, for the versions where parallelism is not
supported, should we instead just throw a warning and fall back to
non-parallel behavior?

Regards,
Jeevan Ladhe

On Mon, 14 Mar 2022 at 21:41, Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com> wrote:

> Hi,
>
> I tried to implement support for parallel ZSTD compression. The
> library provides an option (ZSTD_c_nbWorkers) to specify the
> number of compression workers. The number of parallel
> workers can be set as part of compression parameter and if this
> option is specified then the library performs parallel compression
> based on the specified number of workers.
>
> User can specify the number of parallel worker as part of
> --compress option by appending an integer value after at sign (@).
> (-Z, --compress=[{client|server}-]{gzip|lz4|zstd}[:LEVEL][(at)WORKERS])
>
> Please find the attached patch v1 with the above changes.
>
> Note: ZSTD library version 1.5.x supports parallel compression
> by default and if the library version is lower than 1.5.x then
> parallel compression is enabled only the source is compiled with build
> macro ZSTD_MULTITHREAD. If the linked library version doesn't
> support parallel compression then setting the value of parameter
> ZSTD_c_nbWorkers to a value other than 0 will be no-op and
> returns an error.
>
> Thanks,
> Dipesh
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Yura Sokolov 2022-03-15 10:47:17 Re: BufferAlloc: don't take two simultaneous locks
Previous Message Amit Kapila 2022-03-15 10:18:08 Re: Skipping logical replication transactions on subscriber side