From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Subject: | Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build) |
Date: | 2011-07-02 18:37:28 |
Message-ID: | 4E0F6568.6010809@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 02.07.2011 21:07, Tom Lane wrote:
> I wrote:
>> I tweaked the patch a bit further (don't pessimize the boundary case
>> where there's exactly 4n+1 trigrams, avoid forcing trg1 into memory,
>> improve the comment) and attach that version below. This is a little
>> bit faster but I still wonder if it's worth the price of adding an
>> obscure dependency on the header size.
>
> It occurred to me that we could very easily remove that objection by
> making the code dynamically detect when it's reached a suitably aligned
> trigram. The attached version of the patch does it that way. It seems
> to be a percent or so slower than my previous version, but I think that
> making the function robust against header changes is probably well worth
> that price.
Ah, that opens the door to do something I considered earlier but
rejected because of alignment: instead of three 32-bit word fetches, we
could fetch one 64-bit word and 32-bit word. Might shave a few more
cycles...
Meanwhile, I experimented with optimizing the other part of the loop:
the HASH() macros for setting the right bits in the signature. With the
default compile-time settings, the signature array is 95 bits.
Currently, it's stored in a BITVEC, which is a byte array, but we could
store it in two 64-bit integers, which makes it possible to write SETBIT
differently. I experimented with a few approaches, first was essentially
this:
+ /* Set the nth bit in the signature, in s1 or s2 */
+ #define HASH_S(h) \
+ do { \
+ unsigned int n = HASHVAL(h); \
+ if (((uint64) (n)) < 64) \
+ s1 |= (uint64) 1<<(n); \
+ else \
+ s2 |= (uint64) 1<<((n) - 64); \
+ } while(0)
That was a bit faster on my x64 laptop, but slightly slower on my ia64
HP-UX box. My second try was to use lookup tables, patch attached. That
was yet faster on x64, and a small win on the ia64 box too. I'm not sure
it's worth the added code complexity, though.
Here's a summary of the timings I got with different versions:
ia64 HP-UX (anole):
unpatched: 11194.038 ms
fast_makesign-tom: 10064.980 ms
fast_makesign-2int: 10649.726 ms
fast_makesign-tbl: 9951.547 ms
x64 laptop:
unpatched: 4595,209 ms
fast_makesign-tom: 3346,548 ms
fast_makesign-2int: 3102,874 ms
fast_makesign-tbl: 2997,854 ms
I used the same "REINDEX INDEX i_words" test I used earlier, repeated
each run a couple of times, and took the lowest number.
fast_makesign-tom is the first patch you posted, I haven't tested your
latest one. fast_makesign-2int is with the HASH_S() macro above, and
has_makesign-tbl is with the attached patch.
PS. in case you wonder why the HP-UX box is so much slower than my
laptop; this box isn't really meant for performance testing. It's just
something I happen to have access to, I think it's a virtual machine of
some sort. The numbers are very repeatable, however.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
fast_makesign-tbl.patch | text/x-diff | 7.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2011-07-02 18:46:10 | Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build) |
Previous Message | Michael Mueller | 2011-07-02 18:10:00 | Potential NULL dereference found in typecmds.c |