Re: pglz performance

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Gasper Zejn <zejn(at)owca(dot)info>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pglz performance
Date: 2019-11-26 09:43:24
Message-ID: 20191126094324.7y2n3n3houetavwd@development
Lists: pgsql-hackers

On Mon, Nov 25, 2019 at 05:29:40PM +0900, Michael Paquier wrote:
>On Mon, Nov 25, 2019 at 01:21:27PM +0500, Andrey Borodin wrote:
>> I think status Needs Review describes what is going on better. It's
>> not like something is awaited from my side.
>
>Indeed. You are right so I have moved the patch instead, with "Needs
>review". The patch status was actually incorrect in the CF app, as it
>was marked as waiting on author.
>
>@Tomas: updated versions of the patches have been sent by Andrey.

I've benchmarked the last two patches, using the data sets from the
test_pglz repository [1], with three simple queries:

1) prefix - first 100 bytes of the value

SELECT length(substr(value, 0, 100)) FROM t

2) infix - 100 bytes from the middle

SELECT length(substr(value, test_length/2, 100)) FROM t

3) suffix - last 100 bytes

SELECT length(substr(value, test_length - 100, 100)) FROM t

See the two attached scripts, which implement this benchmark. Each test
was a 60-second pgbench run (single client), measuring tps on two
different machines.

patch 1: v4-0001-Use-memcpy-in-pglz-decompression.patch
patch 2: v4-0001-Use-memcpy-in-pglz-decompression-for-long-matches.patch
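
As a rough illustration of how a percentage relative to master can be
derived from pgbench tps, here is a minimal sketch. It assumes each figure
is simply the patched tps divided by the master tps, rounded to a whole
percent; the attached spreadsheet may compute it differently. The function
name and the tps numbers are hypothetical, not measured values.

```python
def relative_tps(patched_tps: float, master_tps: float) -> int:
    """Return patched throughput as a whole-number percentage of master."""
    if master_tps <= 0:
        raise ValueError("master tps must be positive")
    return round(100.0 * patched_tps / master_tps)


# Hypothetical tps numbers, for illustration only (not from the benchmark).
print(f"{relative_tps(1340.0, 1000.0)}%")
```

Under this reading, values above 100% mean the patched build completed
more transactions per second than master on that query.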

The results (compared to master) from the first machine (i5-2500k CPU)
look like this:

                                 patch 1        |        patch 2
dataset                   prefix  infix  suffix | prefix  infix  suffix
-----------------------------------------------------------------------
000000010000000000000001     99%   134%    161% |   100%   126%    152%
000000010000000000000006     99%   260%    287% |   100%   257%    279%
000000010000000000000008    100%   100%    100% |   100%    95%     91%
16398                       100%   168%    221% |   100%   159%    215%
shakespeare.txt             100%   138%    141% |   100%   116%    117%
mr                           99%   120%    128% |   100%   107%    108%
dickens                     100%   129%    132% |   100%   100%    100%
mozilla                     100%   119%    120% |   100%   102%    104%
nci                         100%   149%    141% |   100%   143%    135%
ooffice                      99%   121%    123% |   100%    97%     98%
osdb                        100%    99%     99% |   100%   100%     99%
reymont                      99%   130%    132% |   100%   106%    107%
samba                       100%   126%    132% |   100%   105%    111%
sao                         100%   100%     99% |   100%   100%    100%
webster                     100%   127%    127% |   100%   106%    106%
x-ray                        99%    99%     99% |   100%   100%    100%
xml                         100%   144%    144% |   100%   130%    128%

and those from the other one (xeon e5-2620v4) look like this:

                                 patch 1        |        patch 2
dataset                   prefix  infix  suffix | prefix  infix  suffix
-----------------------------------------------------------------------
000000010000000000000001     98%   147%    170% |    98%   132%    159%
000000010000000000000006    100%   340%    314% |    98%   334%    355%
000000010000000000000008     99%   100%    105% |    99%    99%    101%
16398                       101%   153%    205% |    99%   148%    201%
shakespeare.txt             100%   147%    149% |    99%   117%    118%
mr                          100%   131%    139% |    99%   112%    108%
dickens                     100%   143%    143% |    99%   103%    102%
mozilla                     100%   122%    122% |    99%   105%    106%
nci                         100%   151%    135% |   100%   135%    125%
ooffice                      99%   127%    129% |    98%   101%    102%
osdb                        102%   100%    101% |   102%   100%     99%
reymont                     101%   142%    143% |   100%   108%    108%
samba                       100%   132%    136% |    99%   109%    112%
sao                          99%   101%    100% |    99%   100%    100%
webster                     100%   132%    129% |   100%   106%    106%
x-ray                        99%   101%    100% |    90%   101%    101%
xml                         100%   147%    148% |   100%   127%    125%

In general, I think the results for both patches are clearly a win, but
patch 1 seems a bit better, especially on the newer (xeon) CPU. So I'd
probably go with that one.

[1] https://github.com/x4m/test_pglz

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
pglz-test.sh application/x-sh 766 bytes
pglz-load.sql application/sql 2.1 KB
pglz.ods application/vnd.oasis.opendocument.spreadsheet 22.8 KB
