Re: pglz performance

From:	Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To:	Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Michael Paquier <michael(at)paquier(dot)xyz>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Vladimir Leskov <vladimirlesk(at)yandex-team(dot)ru>
Subject:	Re: pglz performance
Date:	2019-08-02 13:45:43
Message-ID:	ea57b49a-ecf0-481a-a77b-631833354f7d@postgrespro.ru
Lists:	pgsql-hackers

On 27.06.2019 21:33, Andrey Borodin wrote:
>
>> On 13 May 2019, at 12:14, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>>
>> Decompression can matter a lot for mostly-read workloads and
>> compression can become a bottleneck for heavy-insert loads, so
>> improving compression or decompression should be two separate
>> problems, not two problems linked. Any improvement in one or the
>> other, or even both, is nice to have.
> Here's a patch hacked together by Vladimir for compression.
>
> Key differences (as far as I can see; maybe Vladimir will post a more complete list of optimizations):
> 1. Use functions instead of function-like macros: not surprisingly, they are easier to optimize and put fewer constraints on the compiler.
> 2. More compact hash table: use indexes instead of pointers.
> 3. More robust segment comparison: like memcmp, but returns the index of the first differing byte.
>
> In a weighted mix of different data (same as for compression), the overall speedup is x1.43 on my machine.
>
> The current implementation is integrated into the test_pglz suite for benchmarking purposes[0].
>
> Best regards, Andrey Borodin.
>
> [0] https://github.com/x4m/test_pglz
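
Points 2 and 3 of the quoted list can be illustrated with a minimal sketch. Everything named here (PtrEntry, IdxEntry, match_length, the field layout) is hypothetical and is not taken from Vladimir's patch; it only shows the general idea of linking hash-table entries by small array indexes and of a memcmp-like scan that reports where two segments first differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Pointer-linked history entry: every link costs sizeof(void *) bytes. */
typedef struct PtrEntry
{
    struct PtrEntry *next;          /* next entry in the hash chain */
    const unsigned char *pos;       /* position in the input buffer */
} PtrEntry;

/*
 * Index-linked entry (assumed layout, for illustration only): entries live
 * in one array and refer to each other by 16-bit slot numbers, with slot 0
 * reserved as "no entry". The input position is kept as an offset.
 */
typedef struct IdxEntry
{
    uint16_t next;                  /* slot of the next entry, 0 = end */
    uint32_t pos;                   /* offset into the input buffer */
} IdxEntry;

/*
 * Hypothetical helper: like memcmp, but instead of an ordering it returns
 * the index of the first differing byte, i.e. the length of the common
 * prefix of a and b, examining at most max_len bytes.
 */
static size_t
match_length(const unsigned char *a, const unsigned char *b, size_t max_len)
{
    size_t i = 0;

    while (i < max_len && a[i] == b[i])
        i++;
    return i;
}
```

For example, match_length() on "abcdef" and "abcxef" with max_len 6 returns 3. On a typical 64-bit platform the index-linked entry is also half the size of the pointer-linked one, which is presumably where the "more compact" in point 2 comes from.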

It took me some time to understand that your memcpy optimization is
correct ;)
I have tested different ways of optimizing this fragment of code, but
failed to outperform your implementation!
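
This message does not spell out the optimization. Assuming it refers to replacing the byte-by-byte copy of a back-reference with memcpy calls, the subtle part is that source and destination may overlap when the match offset is smaller than the match length. The sketch below (the function name copy_match and its signature are assumptions, not the patch's code) shows one way to keep every memcpy non-overlapping by doubling the offset after each chunk.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes to dp from the already-written output that starts off
 * bytes before dp (off >= 1). When off < len the regions overlap, which
 * means the last off bytes form a pattern that must be repeated.
 * Returns the new end of the output.
 */
static unsigned char *
copy_match(unsigned char *dp, size_t off, size_t len)
{
    while (off < len)
    {
        /* Source [dp - off, dp) and destination [dp, dp + off) are disjoint. */
        memcpy(dp, dp - off, off);
        len -= off;
        dp += off;

        /*
         * The output behind dp now holds two copies of the pattern, so the
         * next chunk can be twice as long while still starting at the same
         * source byte.
         */
        off += off;
    }

    /* Here len <= off, so this last copy cannot overlap either. */
    memcpy(dp, dp - off, len);
    return dp + len;
}
```

With a buffer that already contains "ab", calling copy_match(buf + 2, 2, 7) leaves "ababababa" in the buffer. The number of memcpy calls grows only logarithmically with the ratio of len to off, which is why such a loop can beat a per-byte copy.
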
Results on my computer are similar to yours:

Decompressor score (summ of all times):
NOTICE: Decompressor pglz_decompress_hacked result 6.627355
NOTICE: Decompressor pglz_decompress_hacked_unrolled result 7.497114
NOTICE: Decompressor pglz_decompress_hacked8 result 7.412944
NOTICE: Decompressor pglz_decompress_hacked16 result 7.792978
NOTICE: Decompressor pglz_decompress_vanilla result 10.652603

Compressor score (summ of all times):
NOTICE: Compressor pglz_compress_vanilla result 116.970005
NOTICE: Compressor pglz_compress_hacked result 89.706105

But ... below are results for lz4:

Decompressor score (summ of all times):
NOTICE: Decompressor lz4_decompress result 3.660066
Compressor score (summ of all times):
NOTICE: Compressor lz4_compress result 10.288594

lz4 has roughly a 2x advantage in decompression speed and roughly a 10x
advantage in compression speed.
So maybe, instead of "hacking" the pglz algorithm, we should simply switch
to lz4?
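
The ratios behind these round figures can be recomputed from the scores listed above. The helper below is only an illustration (the name speedup is an assumption); it divides one total time by another.

```c
/*
 * Ratio of two benchmark totals: how many times faster the candidate is
 * than the baseline. Both arguments are summed times, as printed in the
 * NOTICE lines.
 */
static double
speedup(double baseline_time, double candidate_time)
{
    return baseline_time / candidate_time;
}
```

With the numbers from this message, speedup(6.627355, 3.660066) is about 1.8 for decompression (pglz_decompress_hacked vs lz4) and speedup(89.706105, 10.288594) is about 8.7 for compression (pglz_compress_hacked vs lz4). Measured against the vanilla functions, the ratios are about 2.9 and 11.4 respectively.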

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Re: pglz performance at 2019-06-27 18:33:16 from Andrey Borodin

Responses

Re: pglz performance at 2019-08-02 14:43:45 from Tomas Vondra