Re: pglz performance

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Vladimir Leskov <vladimirlesk(at)yandex-team(dot)ru>
Subject: Re: pglz performance
Date: 2019-08-04 22:08:37
Message-ID: 4df61f6c-5800-30f0-7f84-cef4847e8a07@2ndquadrant.com
Lists: pgsql-hackers

Hi,

On 04/08/2019 21:20, Andres Freund wrote:
> On 2019-08-04 02:41:24 +0200, Petr Jelinek wrote:
>> Same here.
>>
>> Just so that we don't idly talk, what do you think about the attached?
>
> Cool!
>
>> It:
>> - adds new GUC compression_algorithm with possible values of pglz (default)
>> and lz4 (if lz4 is compiled in), requires SIGHUP
>
> As Tomas remarked, I think it shouldn't be SIGHUP but USERSET. And I
> think lz4 should be preferred, if available. I could see us using a
> list style guc, so we could set it to lz4, pglz, and the first available
> one would be used.
>

Sounds reasonable.
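
To make the list-style idea concrete, here is a minimal sketch of
"first available wins". It is not code from the attached patch: the
enum, the function names and the have_lz4 flag are all hypothetical
stand-ins for whatever the real GUC check hook would use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical algorithm identifiers; not taken from the patch. */
typedef enum
{
	COMPR_NONE = -1,
	COMPR_PGLZ = 0,
	COMPR_LZ4 = 1
} ComprAlg;

/* Return 1 and set *out if the named algorithm is usable in this build. */
static int
alg_available(const char *name, size_t len, int have_lz4, ComprAlg *out)
{
	if (len == 4 && strncmp(name, "pglz", 4) == 0)
	{
		*out = COMPR_PGLZ;
		return 1;
	}
	if (len == 3 && strncmp(name, "lz4", 3) == 0 && have_lz4)
	{
		*out = COMPR_LZ4;
		return 1;
	}
	return 0;
}

/*
 * Walk a comma-separated list such as "lz4, pglz" and return the first
 * entry that is available, or COMPR_NONE if nothing matches.
 */
static ComprAlg
pick_first_available(const char *list, int have_lz4)
{
	const char *p = list;

	assert(list != NULL);
	while (*p != '\0')
	{
		const char *start;
		size_t		len;
		ComprAlg	alg;

		/* skip separators between entries */
		while (*p == ' ' || *p == ',')
			p++;
		start = p;
		while (*p != '\0' && *p != ',' && *p != ' ')
			p++;
		len = (size_t) (p - start);
		if (len > 0 && alg_available(start, len, have_lz4, &alg))
			return alg;
	}
	return COMPR_NONE;
}
```

With a setting of "lz4, pglz", a build without lz4 would fall through
to pglz, so the same configuration would work on both kinds of builds.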

>> - adds 1 byte header to the compressed data where we currently store the
>> algorithm kind, that leaves us with 254 more to add :) (that's an extra
>> overhead compared to the current state)
>
> Hm. Why do we need an additional byte? IIRC my patch added that only
> for the case we would run out of space for compression formats without
> extending any sizes?
>

Yeah, your patch worked differently (I didn't actually use any code from
it). The main reason I add the byte is that I am storing the algorithm
in the compressed value itself, not in the varlena header. I was mainly
trying to avoid having every caller care about storing and loading the
compression algorithm. I also can't say I particularly like that hack
in your patch.
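
As a rough illustration of storing the algorithm inside the compressed
value, the sketch below prepends a one-byte tag to an already
compressed payload and strips it again on read. The function names and
the tag values are assumptions made for this example, not identifiers
from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tag values; one byte allows 256 distinct algorithms. */
#define COMPR_ALG_PGLZ	0
#define COMPR_ALG_LZ4	1

/*
 * Write the algorithm tag followed by the compressed payload into dst.
 * Returns the number of bytes written, or 0 if dst is too small.
 */
static size_t
wrap_compressed(uint8_t alg, const uint8_t *payload, size_t payload_len,
				uint8_t *dst, size_t dst_cap)
{
	if (dst_cap < payload_len + 1)
		return 0;
	dst[0] = alg;
	memcpy(dst + 1, payload, payload_len);
	return payload_len + 1;
}

/*
 * Read the tag back and point *payload at the bytes that follow it.
 * Callers only ever see the tagged buffer, so they do not need to
 * track the algorithm themselves.
 */
static uint8_t
unwrap_compressed(const uint8_t *src, size_t src_len,
				  const uint8_t **payload, size_t *payload_len)
{
	assert(src_len >= 1);
	*payload = src + 1;
	*payload_len = src_len - 1;
	return src[0];
}
```

The cost is the one extra byte per compressed value mentioned above;
the benefit is that the decompression side can dispatch on the tag
without help from the caller.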

However, if we want separate GUCs for TOAST and WAL, then we'll have to
do that anyway, so maybe it does not matter anymore (we can't use a
similar hack there AFAICS, though).

>
>> - changes the rawsize in TOAST header to 31 bits via bit packing
>> - uses the extra bit to differentiate between old and new format
>
> Hm. Wouldn't it be easier to just use a different vartag for this?
>

That would only work for external TOAST pointers, right? The compressed
varlena can also be stored inline, and potentially in an index tuple.
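
For reference, the bit packing from the quoted list (31 bits of raw
size plus one format bit) can be sketched as below. Which bit carries
the flag is an assumption here (the top bit), and the macro and
function names are made up for the example rather than taken from the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: low 31 bits hold rawsize, top bit marks the new format. */
#define RAWSIZE_BITS	31
#define RAWSIZE_MASK	((UINT32_C(1) << RAWSIZE_BITS) - 1)
#define NEWFORMAT_FLAG	(UINT32_C(1) << RAWSIZE_BITS)

/* Combine a raw size and the format flag into one 32-bit word. */
static uint32_t
pack_rawsize(uint32_t rawsize, int new_format)
{
	assert(rawsize <= RAWSIZE_MASK);
	return rawsize | (new_format ? NEWFORMAT_FLAG : 0);
}

/* Extract the raw size, ignoring the format flag. */
static uint32_t
unpack_rawsize(uint32_t word)
{
	return word & RAWSIZE_MASK;
}

/* Return nonzero if the word was written in the new format. */
static int
is_new_format(uint32_t word)
{
	return (word & NEWFORMAT_FLAG) != 0;
}
```

Because the flag lives in the same word as the size, this works the
same way for inline and external values; under this assumed layout, an
old-format word with a raw size below 2^31 reads back unchanged with
the flag clear.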

>
>> - I expect my changes to configure.in are not the greatest as I have
>> pretty much zero experience with autoconf
>
> FWIW the configure output changes are likely because you used a modified
> version of autoconf. Unfortunately debian/ubuntu ship one with vendor
> patches.
>

Yeah, Ubuntu here, that explains it.

--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/
