Re: zstd compression for pg_dump

From: Jacob Champion <jchampion(at)timescale(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, gkokolatos(at)pm(dot)me, Michael Paquier <michael(at)paquier(dot)xyz>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Subject: Re: zstd compression for pg_dump
Date: 2023-03-03 21:38:05
Message-ID: CAAWbhmj-pEhpzfMA+ARnq+L3uS1A1AsmNjz9m_Cx=73n4GRZjQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 3, 2023 at 10:55 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> Thanks for looking. If your zstd library is compiled with thread
> support, could you also try with :workers=N ? I believe this is working
> correctly, but I'm going to ask for help verifying that...

Unfortunately not (Ubuntu 20.04):

pg_dump: error: could not set compression parameter: Unsupported parameter

But that lets me review the error! I think these error messages should
say which options caused them.
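
As a rough illustration of what that could look like, here is a minimal
sketch that includes the offending option name in the message. The helper
name format_param_error and the exact wording are assumptions made for
this example; they are not taken from the patch or from pg_dump.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of pg_dump): build an error message that
 * names the compression option that was rejected, followed by the reason
 * string reported by the compression library.
 *
 * Returns the value of snprintf(), i.e. the length the full message
 * would have had, so callers can detect truncation.
 */
static int
format_param_error(char *buf, size_t buflen,
                   const char *option, const char *reason)
{
    return snprintf(buf, buflen,
                    "could not set compression parameter \"%s\": %s",
                    option, reason);
}
```

With option "workers" and the reason string from the failure above, this
would produce: could not set compression parameter "workers": Unsupported
parameter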

> It'd be especially useful to test under Windows, where pg_dump/pg_restore
> use threads instead of forking... If you have a Windows environment but
> not set up for development, I think it's possible to get Cirrus CI to
> compile a patch for you and then retrieve the binaries provided as an
> "artifact" (credit/blame for this idea should be directed to Thomas
> Munro).

I should be able to do that next week.

> > With this particular dataset, I don't see much improvement with
> > zstd:long.
>
> Yeah. I think this could be because either 1) you already got very good
> compression without looking at more data; and/or 2) the neighboring data
> is already very similar, maybe equally or more similar than the more
> distant data, so there's nothing to gain from it.

What kinds of improvements do you see with your setup? I'm wondering
when we would suggest that people use it.

> I don't want to start exposing lots of fine-grained parameters at this
> point. In the immediate case, it looks like it may require more than
> just adding another parameter:
>
> Note: If windowLog is set to larger than 27,
> --long=windowLog or --memory=windowSize needs to be passed to the
> decompressor.

Hm. That would complicate things.
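
To make the threshold in that note concrete: windowLog is the base-2
logarithm of the window size, so 27 corresponds to 128 MiB. Below is a
minimal sketch of the arithmetic. The constant and function names are
made up for this example and are not zstd or pg_dump identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold from the zstd note quoted above (name is hypothetical). */
#define DEFAULT_WINDOWLOG_LIMIT 27

/* Window size in bytes for a given windowLog: 2^windowLog. */
static uint64_t
window_size_bytes(int window_log)
{
    return UINT64_C(1) << window_log;
}

/*
 * True if, per the quoted note, the decompressor would need to be told
 * about the larger window (--long=windowLog or --memory=windowSize).
 */
static bool
decompressor_needs_override(int window_log)
{
    return window_log > DEFAULT_WINDOWLOG_LIMIT;
}
```

So a dump written with windowLog = 27 (134217728 bytes) would still be
readable with default settings, while anything larger would need the
matching setting on the restore side as well.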

Thanks,
--Jacob
