Re: pglz compression performance, take two

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pglz compression performance, take two
Date: 2021-06-28 16:05:04
Message-ID: 90649B7F-EF71-40F7-93D0-1029AD0D90F3@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Jun 27, 2021, at 3:41 AM, Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
> And here's what I've come up with.

I have not tested the patch yet, but here are some quick review comments:

> #define PGLZ_HISTORY_SIZE 0x0fff - 1 /* to avoid compare in iteration */
...
> static PGLZ_HistEntry hist_entries[PGLZ_HISTORY_SIZE + 1];
...
> if (hist_next == PGLZ_HISTORY_SIZE + 1)

These are the only uses of PGLZ_HISTORY_SIZE. Perhaps you could just defined the symbol as 0x0fff and skip the -1 and +1 business?

> /* ----------
> * pglz_compare -
> *
> * Compares 4 bytes at pointers
> * ----------
> */
> static inline bool
> pglz_compare32(const void *ptr1, const void *ptr2)
> {
> return memcmp(ptr1, ptr2, 4) == 0;
> }

The comment function name differs from the actual function name.

Also, pglz_compare returns an offset into the string, whereas pglz_compare32 returns a boolean. This is fairly unintuitive. The "32" part of pglz_compare32 implies doing the same thing as pglz_compare but where the string is known to be 4 bytes in length. Given that pglz_compare32 is dissimilar to pglz_compare, perhaps avoid using /pglz_compare/ in its name?

> /*
> * Determine length of match. A better match must be larger than the
> * best so far. And if we already have a match of 16 or more bytes,
> * it's worth the call overhead to use memcmp()

This comment is hard to understand, given the code that follows. The first block calls memcmp(), which seems to be the function overhead you refer to. The second block calls the static inline function pglz_compare32, which internally calls memcmp(). Superficially, there seems to be a memcmp() function call either way. The difference is that in the first block's call to memcmp(), the length is a runtime value, and in the second block, it is a compile-time known value. If you are depending on the compiler to notice this distinction and optimize the second call, perhaps you can mention that explicitly? Otherwise, reading and understanding the comment takes more effort.

I took a quick look for other places in the code that try to beat the performance of memcmp on short strings. In varlena.c, rest_of_char_same() seems to do so. We also use comparisons on NameData, which frequently contains strings shorter than 16 bytes. Is it worth sharting a static inline function that uses your optimization in other places? How confident are you that your optimization really helps?


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2021-06-28 16:33:53 Re: pglz compression performance, take two
Previous Message Andrew Dunstan 2021-06-28 15:43:01 Re: code fork June 28th