Re: Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)
Date: 2011-09-29 23:03:01
Message-ID: 23146.1317337381@sss.pgh.pa.us
Lists: pgsql-hackers

Alexander Korotkov <aekorotkov(at)gmail(dot)com> writes:
> On Fri, Sep 30, 2011 at 1:08 AM, Heikki Linnakangas <
> heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> At every call, gtrgm_penalty() has to calculate the signature for newitem,
>> using makesign(). That's an enormous waste of effort, but there's currently
>> no way for gtrgm_penalty() to avoid that. If we could call makesign() only on
>> the first call in the loop, and remember it for the subsequent calls, that
>> would eliminate the need for any micro-optimization in makesign() and make
>> inserting into a trigram index much faster (including building the index
>> from scratch).

> Isn't it possible to cache the signature of newitem in gtrgm_penalty,
> as gtrgm_consistent does for the query?

[ studies that code for a while ... ] Ick, what a kluge.

The main problem with that code is that the cache data gets leaked at
the conclusion of a scan. Having just seen the consequences of leaking
the "giststate", I think this is something we need to fix, not emulate.
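
For readers who have not looked at that code, the caching pattern under discussion can be sketched roughly as below. This is a standalone illustration, not the pg_trgm source: SigCache, toy_makesign(), cached_sign() and SIGLEN are hypothetical names, and the "slot" pointer only plays the role that fn_extra plays in the real support function. The cached copy is compared bytewise with the incoming item and the signature is rebuilt only when they differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SIGLEN 12               /* hypothetical signature size, in bytes */

/* Hypothetical cache record; "slot" below stands in for fn_extra. */
typedef struct SigCache
{
    size_t         keylen;      /* length of the cached item */
    unsigned char *key;         /* private copy of the cached item */
    unsigned char  sign[SIGLEN];
} SigCache;

static int makesign_calls = 0;  /* counts the expensive recomputations */

/* Toy stand-in for makesign(): sets one signature bit per input byte. */
static void
toy_makesign(unsigned char *sign, const unsigned char *item, size_t len)
{
    memset(sign, 0, SIGLEN);
    for (size_t i = 0; i < len; i++)
    {
        unsigned bit = item[i] % (SIGLEN * 8);

        sign[bit / 8] |= (unsigned char) (1u << (bit % 8));
    }
    makesign_calls++;
}

/* Return the signature, rebuilding it only if the item has changed. */
static const unsigned char *
cached_sign(SigCache **slot, const unsigned char *item, size_t len)
{
    SigCache *c;

    assert(slot != NULL);
    c = *slot;
    if (c == NULL || c->keylen != len || memcmp(c->key, item, len) != 0)
    {
        if (c != NULL)
        {
            free(c->key);
            free(c);
        }
        c = malloc(sizeof(SigCache));
        if (c == NULL)
            abort();
        c->key = malloc(len > 0 ? len : 1);
        if (c->key == NULL)
            abort();
        memcpy(c->key, item, len);
        c->keylen = len;
        toy_makesign(c->sign, item, len);
        *slot = c;
    }
    return c->sign;
}
```

Note that nothing in this sketch releases the last cache entry when the caller is finished with it, which is the leak being complained about here.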

I wonder whether it's worth having the GiST code create a special
scan-lifespan (or insert-lifespan) memory context that could be used
for cached data such as this? It's already creating a couple of
contexts for its own purposes, so one more might not be a big problem.
We'd have to figure out a way to make that context available to GiST
support functions, though, as well as something cleaner than fn_extra
for them to keep pointers in.
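
To make the idea concrete, here is a rough standalone sketch of what a scan-lifespan context might look like. It assumes a toy allocator in place of the real memory-context machinery; ToyContext, ToyScan, toy_context_alloc(), toy_scan_begin() and toy_scan_end() are hypothetical names invented for illustration, not existing or proposed APIs. Everything a support function caches is allocated in the scan's context and is released in one step when the scan ends, so nothing outlives the scan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Chunk header; the union keeps the payload that follows it aligned. */
typedef union Chunk
{
    union Chunk *next;
    max_align_t  align;
} Chunk;

/* Hypothetical, much-simplified stand-in for a memory context. */
typedef struct ToyContext
{
    Chunk  *chunks;             /* every allocation made in this context */
    size_t  nchunks;
} ToyContext;

/* Hypothetical scan state: owns the context and a slot for cached data. */
typedef struct ToyScan
{
    ToyContext *cache_cxt;      /* lives exactly as long as the scan */
    void       *cache;          /* where a support function keeps its pointer */
} ToyScan;

/* Allocate memory that belongs to the context. */
static void *
toy_context_alloc(ToyContext *cxt, size_t size)
{
    Chunk *c;

    assert(cxt != NULL);
    c = malloc(sizeof(Chunk) + size);
    if (c == NULL)
        abort();
    c->next = cxt->chunks;
    cxt->chunks = c;
    cxt->nchunks++;
    return (void *) (c + 1);    /* payload starts right after the header */
}

/* Start a scan: create its private cache context. */
static void
toy_scan_begin(ToyScan *scan)
{
    scan->cache_cxt = calloc(1, sizeof(ToyContext));
    if (scan->cache_cxt == NULL)
        abort();
    scan->cache = NULL;
}

/* End a scan: free everything cached during it; returns chunks freed. */
static size_t
toy_scan_end(ToyScan *scan)
{
    size_t  freed = 0;
    Chunk  *c = scan->cache_cxt->chunks;

    while (c != NULL)
    {
        Chunk *next = c->next;

        free(c);
        freed++;
        c = next;
    }
    free(scan->cache_cxt);
    scan->cache_cxt = NULL;
    scan->cache = NULL;
    return freed;
}
```

In this sketch a support function would allocate its cached data with toy_context_alloc(scan->cache_cxt, ...) and store the pointer in scan->cache; how the real support functions would get at such a context and slot is the open question raised above.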

regards, tom lane
