From: | Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> |
---|---|
To: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Vladimir Leskov <vladimirlesk(at)yandex-team(dot)ru> |
Subject: | Re: pglz performance |
Date: | 2019-08-02 13:45:43 |
Message-ID: | ea57b49a-ecf0-481a-a77b-631833354f7d@postgrespro.ru |
Lists: | pgsql-hackers |
On 27.06.2019 21:33, Andrey Borodin wrote:
>
>> On 13 May 2019, at 12:14, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>>
>> Decompression can matter a lot for mostly-read workloads and
>> compression can become a bottleneck for heavy-insert loads, so
>> improving compression or decompression should be two separate
>> problems, not two problems linked. Any improvement in one or the
>> other, or even both, is nice to have.
> Here's a patch hacked together by Vladimir for compression.
>
> Key differences (as far as I can see; maybe Vladimir will post a more complete list of optimizations):
> 1. Use functions instead of function-like macros: not surprisingly, they are easier to optimize and impose fewer constraints on the compiler.
> 2. More compact hash table: use indexes instead of pointers.
> 3. More robust segment comparison: like memcmp, but returning the index of the first differing byte.
>
> In a weighted mix of different data (same as for compression), the overall speedup is 1.43x on my machine.
>
> The current implementation is integrated into the test_pglz suite for benchmarking purposes[0].
>
> Best regards, Andrey Borodin.
>
> [0] https://github.com/x4m/test_pglz
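To make item 3 of the quoted list concrete, here is a minimal sketch of a memcmp-like comparison that returns the index of the first differing byte. The name first_mismatch and its signature are hypothetical and are not taken from the patch; the real code may compare several bytes at a time.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch): compare two byte ranges and
 * return the index of the first byte that differs. If the first max_len
 * bytes are all equal, max_len is returned. The return value is therefore
 * the length of the common prefix, which is what an LZ-style compressor
 * needs when it measures how long a match is.
 */
size_t
first_mismatch(const unsigned char *a, const unsigned char *b, size_t max_len)
{
    size_t i = 0;

    /* Advance while both ranges agree and the limit is not reached. */
    while (i < max_len && a[i] == b[i])
        i++;

    return i;
}
```

Unlike memcmp, which only reports the ordering of the two ranges, this form gives the match length directly, so the caller does not need a second pass to find where the match ends.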
It took me some time to understand that your memcpy optimization is
correct ;)
I have tested different ways of optimizing this fragment of code, but
failed to outperform your implementation!
Results on my computer are similar to yours:
Decompressor score (sum of all times):
NOTICE: Decompressor pglz_decompress_hacked result 6.627355
NOTICE: Decompressor pglz_decompress_hacked_unrolled result 7.497114
NOTICE: Decompressor pglz_decompress_hacked8 result 7.412944
NOTICE: Decompressor pglz_decompress_hacked16 result 7.792978
NOTICE: Decompressor pglz_decompress_vanilla result 10.652603
Compressor score (sum of all times):
NOTICE: Compressor pglz_compress_vanilla result 116.970005
NOTICE: Compressor pglz_compress_hacked result 89.706105
But ... below are results for lz4:
Decompressor score (sum of all times):
NOTICE: Decompressor lz4_decompress result 3.660066
Compressor score (sum of all times):
NOTICE: Compressor lz4_compress result 10.288594
That is roughly a 2x advantage in decompression speed and about a 10x
advantage in compression speed.
So maybe, instead of "hacking" the pglz algorithm, we should switch to
lz4?
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company