Re: Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)
Date: 2011-09-29 21:08:23
Message-ID: 4E84DE47.8070603@enterprisedb.com
Lists: pgsql-hackers

On 29.09.2011 20:27, Kevin Grittner wrote:
> Heikki's second version, a more radical revision optimized for 64
> bit systems, blows up on a 32 bit compile, writing off the end of
> the structure. Personally, I'd be OK with sacrificing some
> performance for 32 bit systems to get better performance on 64 bit
> systems, since people who care about performance generally seem to
> be on 64 bit builds these days -- but it has to run. Given Tom's
> reservations about this approach, I don't know whether Heikki is
> interested in fixing the crash so it can be benchmarked. Heikki?

No, I'm not going to work on that 64-bit patch.

Looking at the big picture, however, the real problem with all those
makesign() calls is that they happen in the first place. They happen
when gist needs to choose which child page to place a new tuple on. It
calls the penalty function for every item on the internal page, always passing
the new key as the 2nd argument, along the lines of:

    for (all items on internal page)
        penalty(item[i], newitem);

At every call, gtrgm_penalty() has to calculate the signature for
newitem, using makesign(). That's an enormous waste of effort, but
there's currently no way for gtrgm_penalty() to avoid that. If we could call
makesign() only on the first call in the loop, and remember it for the
subsequent calls, that would eliminate the need for any
micro-optimization in makesign() and make inserting into a trigram index
much faster (including building the index from scratch).
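
To illustrate the idea, here is a rough, self-contained sketch in C. It is not the pg_trgm or GiST code: toy_makesign(), Signature, SIGLEN, choose_naive() and choose_cached() are all made-up names, and the signature and penalty calculations are toy stand-ins. It only contrasts recomputing the new key's signature on every penalty call with computing it once before the loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIGLEN 12               /* hypothetical signature length in bytes */

typedef struct
{
    uint8_t bits[SIGLEN];
} Signature;

static int makesign_calls = 0;  /* counts signature computations */

/* Toy stand-in for makesign(): hash each 3-byte window to one bit. */
static void
toy_makesign(const char *key, Signature *sig)
{
    size_t len = strlen(key);

    memset(sig, 0, sizeof(*sig));
    makesign_calls++;
    for (size_t i = 0; i + 3 <= len; i++)
    {
        uint32_t h = ((uint32_t) (unsigned char) key[i] << 16) |
                     ((uint32_t) (unsigned char) key[i + 1] << 8) |
                     (uint32_t) (unsigned char) key[i + 2];

        h %= SIGLEN * 8;
        sig->bits[h / 8] |= (uint8_t) (1u << (h % 8));
    }
}

/* Toy penalty: Hamming distance between two signatures. */
static int
sig_distance(const Signature *a, const Signature *b)
{
    int d = 0;

    for (int i = 0; i < SIGLEN; i++)
    {
        uint8_t x = a->bits[i] ^ b->bits[i];

        while (x)
        {
            d += x & 1;
            x >>= 1;
        }
    }
    return d;
}

/* Mimics the current situation: the new key is re-signed on every call. */
static int
penalty_naive(const Signature *item, const char *newkey)
{
    Signature s;

    toy_makesign(newkey, &s);
    return sig_distance(item, &s);
}

/* Pick the lowest-penalty item, re-signing newkey n times (n >= 1). */
static int
choose_naive(const Signature *items, int n, const char *newkey)
{
    int best = 0;
    int bestp = penalty_naive(&items[0], newkey);

    for (int i = 1; i < n; i++)
    {
        int p = penalty_naive(&items[i], newkey);

        if (p < bestp)
        {
            bestp = p;
            best = i;
        }
    }
    return best;
}

/* Same choice, but newkey is signed once and reused (n >= 1). */
static int
choose_cached(const Signature *items, int n, const char *newkey)
{
    Signature s;
    int best = 0;
    int bestp;

    toy_makesign(newkey, &s);   /* the only signature computation */
    bestp = sig_distance(&items[0], &s);
    for (int i = 1; i < n; i++)
    {
        int p = sig_distance(&items[i], &s);

        if (p < bestp)
        {
            bestp = p;
            best = i;
        }
    }
    return best;
}
```

Both functions pick the same item; the cached variant just does one signature computation per insertion instead of one per item on the page. How the remembered signature would actually be passed between penalty calls in GiST is exactly the open question, and this sketch does not try to answer it.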

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
