Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)
Date: 2011-07-02 18:37:28
Message-ID: 4E0F6568.6010809@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 02.07.2011 21:07, Tom Lane wrote:
> I wrote:
>> I tweaked the patch a bit further (don't pessimize the boundary case
>> where there's exactly 4n+1 trigrams, avoid forcing trg1 into memory,
>> improve the comment) and attach that version below. This is a little
>> bit faster but I still wonder if it's worth the price of adding an
>> obscure dependency on the header size.
>
> It occurred to me that we could very easily remove that objection by
> making the code dynamically detect when it's reached a suitably aligned
> trigram. The attached version of the patch does it that way. It seems
> to be a percent or so slower than my previous version, but I think that
> making the function robust against header changes is probably well worth
> that price.

Ah, that opens the door to do something I considered earlier but
rejected because of alignment: instead of three 32-bit word fetches, we
could fetch one 64-bit word and 32-bit word. Might shave a few more
cycles...

Meanwhile, I experimented with optimizing the other part of the loop:
the HASH() macros for setting the right bits in the signature. With the
default compile-time settings, the signature array is 95 bits.
Currently, it's stored in a BITVEC, which is a byte array, but we could
store it in two 64-bit integers, which makes it possible to write SETBIT
differently. I experimented with a few approaches, first was essentially
this:

+ /* Set the nth bit in the signature, in s1 or s2 */
+ #define HASH_S(h) \
+ do { \
+ unsigned int n = HASHVAL(h); \
+ if (((uint64) (n)) < 64) \
+ s1 |= (uint64) 1<<(n); \
+ else \
+ s2 |= (uint64) 1<<((n) - 64); \
+ } while(0)

That was a bit faster on my x64 laptop, but slightly slower on my ia64
HP-UX box. My second try was to use lookup tables, patch attached. That
was yet faster on x64, and a small win on the ia64 box too. I'm not sure
it's worth the added code complexity, though.

Here's a summary of the timings I got with different versions:

ia64 HP-UX (anole):

unpatched: 11194.038 ms
fast_makesign-tom: 10064.980 ms
fast_makesign-2int: 10649.726 ms
fast_makesign-tbl: 9951.547 ms

x64 laptop:

unpatched: 4595,209 ms
fast_makesign-tom: 3346,548 ms
fast_makesign-2int: 3102,874 ms
fast_makesign-tbl: 2997,854 ms

I used the same "REINDEX INDEX i_words" test I used earlier, repeated
each run a couple of times, and took the lowest number.
fast_makesign-tom is the first patch you posted, I haven't tested your
latest one. fast_makesign-2int is with the HASH_S() macro above, and
has_makesign-tbl is with the attached patch.

PS. in case you wonder why the HP-UX box is so much slower than my
laptop; this box isn't really meant for performance testing. It's just
something I happen to have access to, I think it's a virtual machine of
some sort. The numbers are very repeatable, however.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
fast_makesign-tbl.patch text/x-diff 7.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-07-02 18:46:10 Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)
Previous Message Michael Mueller 2011-07-02 18:10:00 Potential NULL dereference found in typecmds.c