Re: pglz performance

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: Gasper Zejn <zejn(at)owca(dot)info>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pglz performance
Date: 2019-10-21 09:09:29
Message-ID: 04E4EC5C-B603-4EF8-88C3-EA5CB5A59066@yandex-team.ru
Lists: pgsql-hackers

> On 28 Sep 2019, at 10:29, Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
> I hope to benchmark decompression on Silesian corpus soon.

I've done it, and the results are mixed.
The dataset adds 12 payloads to our 5. The payloads have relatively high entropy. In many cases pglz cannot compress them at all, so decompression is a no-op and the data is stored as is.
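
To illustrate the "stored as is" case, here is a rough sketch. It uses zlib from the Python standard library as a stand-in, not pglz, and the store/load helpers and the "keep raw unless smaller" rule are hypothetical simplifications rather than PostgreSQL's actual logic.

```python
import os
import zlib

def store(payload):
    """Return (is_compressed, stored_bytes); keep the raw bytes if compression does not help."""
    packed = zlib.compress(payload)  # zlib stand-in, NOT pglz
    if len(packed) < len(payload):
        return True, packed
    return False, payload

def load(is_compressed, stored):
    """Decompress only when the payload was actually compressed; otherwise this is a no-op."""
    return zlib.decompress(stored) if is_compressed else stored

random_payload = os.urandom(4096)   # high entropy: effectively incompressible
text_payload = b"pglz " * 800       # low entropy: compresses well

for name, payload in [("random", random_payload), ("text", text_payload)]:
    is_compressed, stored = store(payload)
    print(name, "compressed" if is_compressed else "stored as is", len(stored))
```

For the random payload the "decompression" path just returns the stored bytes, so any benchmark over such inputs measures very little decompressor work.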

Decompressor pglz_decompress_hacked result 48.281747
Decompressor pglz_decompress_hacked8 result 33.868779
Decompressor pglz_decompress_vanilla result 42.510165

Tested on Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz

With the Silesian corpus, pglz_decompress_hacked actually decreases performance on high-entropy data.
Meanwhile, pglz_decompress_hacked8 is still faster than the usual pglz_decompress.
In spite of these benchmarks, I think that pglz_decompress_hacked8 is the safer option.
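
The relative differences between the three results above can be worked out with a short script. This is only a sketch: it assumes the figures are timings where lower is better (the units are not stated here), and the relative_change helper is a name made up for this example.

```python
# Figures copied from the results above; units are not stated, lower is assumed to be better.
results = {
    "pglz_decompress_hacked": 48.281747,
    "pglz_decompress_hacked8": 33.868779,
    "pglz_decompress_vanilla": 42.510165,
}
baseline = results["pglz_decompress_vanilla"]

def relative_change(value, base):
    """Percentage difference of value relative to base (negative means smaller than base)."""
    return (value - base) / base * 100.0

for name, value in results.items():
    print(f"{name}: {relative_change(value, baseline):+.1f}% vs vanilla")
```

Under that assumption, pglz_decompress_hacked comes out about 13.6% above vanilla and pglz_decompress_hacked8 about 20.3% below it.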

I've updated test suite [0] and anyone interested can verify benchmarks.

--
Andrey Borodin
Open source RDBMS development team leader
Yandex.Cloud

[0] https://github.com/x4m/test_pglz
