pglz performance

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Vladimir Leskov <vladimirlesk(at)yandex-team(dot)ru>
Subject: pglz performance
Date: 2019-05-13 02:45:59
Lists: pgsql-hackers

Hi hackers!

I was reviewing Paul Ramsey's TOAST patch[0] and noticed that there is significant room for improvement in the performance of pglz compression and decompression.

Vladimir and I started to investigate ways to speed up byte copying and eventually created a test suite[1] to measure the performance of compression and decompression.
It is an extension with a single function, test_pglz(), which runs tests across different:
1. Data payloads
2. Compression implementations
3. Decompression implementations

Currently we mostly test decompression improvements against two WALs and one data file taken from a pgbench-generated database. Any suggestions for more relevant data payloads are very welcome.
My laptop tests show that our decompression implementation[2] can be 15% to 50% faster.
I've also noted that compression is extremely slow, ~30 times slower than decompression. I believe we can do something about it.

We are focusing only on speeding up the existing codec, without considering other compression algorithms.

Any comments are much appreciated.

The most important questions are:
1. What are relevant data sets?
2. What are relevant CPUs? I have only Xeon-based servers and a few laptops/desktops with Intel CPUs.
3. If compression is 30 times slower, should we focus on compression instead of decompression?

Best regards, Andrey Borodin.


