Re: Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)
Date: 2011-09-29 21:16:23
Message-ID: 15195.1317330983@sss.pgh.pa.us
Lists: pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> Looking at the big picture, however, the real problem with all those
> makesign() calls is that they happen in the first place. They happen
> when gist needs to choose which child page to place a new tuple on. It
> calls the penalty for every item on the internal page, always passing
> the new key as the 2nd argument, along the lines of:

> for (all items on internal page)
>     penalty(item[i], newitem);

> At every call, gtrgm_penalty() has to calculate the signature for
> newitem, using makesign(). That's an enormous waste of effort, but
> there's currently no way for gtrgm_penalty() to avoid that.

Hmm. Are there any other datatypes for which the penalty function has
to duplicate effort? I'm disinclined to fool with this if pg_trgm is
the only example ... but if it's not, maybe we should do something
about that instead of micro-optimizing makesign.
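
To make the duplicated effort concrete, here is a minimal standalone sketch of the loop described above. It is not the pg_trgm code: Sig, toy_makesign(), hemdist(), choose_naive() and choose_cached() are hypothetical stand-ins, the signature length is an arbitrary assumption, and the penalty is simply taken to be a Hamming distance between signatures. The only point is that the naive loop rebuilds the new key's signature once per item, while hoisting it out of the loop builds it once.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIGLEN    12              /* signature length in bytes (assumed) */
#define SIGLENBIT (SIGLEN * 8)

typedef struct { uint8_t bits[SIGLEN]; } Sig;   /* hypothetical type */

static int makesign_calls = 0;    /* instrumentation for the demo only */

/* Toy stand-in for makesign(): set one bit per trigram value. */
static void toy_makesign(Sig *sig, const uint32_t *trgms, size_t n)
{
    makesign_calls++;
    memset(sig, 0, sizeof(*sig));
    for (size_t i = 0; i < n; i++) {
        uint32_t bit = trgms[i] % SIGLENBIT;
        sig->bits[bit / 8] |= (uint8_t) (1u << (bit % 8));
    }
}

/* Hamming distance between two signatures, used here as the penalty. */
static int hemdist(const Sig *a, const Sig *b)
{
    int dist = 0;
    for (size_t i = 0; i < SIGLEN; i++) {
        uint8_t x = a->bits[i] ^ b->bits[i];
        for (; x != 0; x >>= 1)
            dist += x & 1;
    }
    return dist;
}

/* Naive shape: the new key's signature is rebuilt for every item. */
static size_t choose_naive(const Sig *items, size_t nitems,
                           const uint32_t *newkey, size_t n)
{
    size_t best = 0;
    int bestpen = INT_MAX;

    assert(nitems > 0);
    for (size_t i = 0; i < nitems; i++) {
        Sig newsig;
        toy_makesign(&newsig, newkey, n);      /* repeated work */
        int pen = hemdist(&items[i], &newsig);
        if (pen < bestpen) { bestpen = pen; best = i; }
    }
    return best;
}

/* Hoisted shape: build the signature once, reuse it for every item. */
static size_t choose_cached(const Sig *items, size_t nitems,
                            const uint32_t *newkey, size_t n)
{
    size_t best = 0;
    int bestpen = INT_MAX;
    Sig newsig;

    assert(nitems > 0);
    toy_makesign(&newsig, newkey, n);          /* done exactly once */
    for (size_t i = 0; i < nitems; i++) {
        int pen = hemdist(&items[i], &newsig);
        if (pen < bestpen) { bestpen = pen; best = i; }
    }
    return best;
}
```

Both functions pick the same child; they differ only in how many times the signature is computed (once per item versus once in total). Whether the hoisting belongs in the caller or in some per-call cache inside the penalty function is exactly the open design question, and this sketch takes no position on it.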

regards, tom lane
