Re: [GENERAL] Creation of tsearch2 index is very slow

From: "Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: [GENERAL] Creation of tsearch2 index is very slow
Date: 2006-01-20 22:57:51
Message-ID: 20060120225751.GA27230@uio.no
Lists: pgsql-general pgsql-performance

On Fri, Jan 20, 2006 at 05:50:36PM -0500, Tom Lane wrote:
> Yeah, but fetching from a small constant table is pretty quick too;
> I doubt it's worth getting involved in machine-specific assembly code
> for this. I'm much more interested in the idea of improving the
> furthest-distance algorithm in gtsvector_picksplit --- if we can do
> that, it'll probably drop the distance calculation down to the point
> where it's not really worth the trouble to assembly-code it.

For the record: Could we do with a less-than-optimal split here? In that
case, an extremely simple heuristic is:

    best = distance(0, 1)
    best_i = 0
    best_j = 1

    for i = 2..last:
        if distance(best_i, i) > best:
            best = distance(best_i, i)
            best_j = i
        else if distance(best_j, i) > best:
            best = distance(best_j, i)
            best_i = i

I've tested it on various data, and although it's definitely not _correct_
(nothing guarantees that it finds the true furthest pair), it generally gets
within 10% of the best distance.
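
For anyone who wants to try the heuristic outside the backend, here is a minimal runnable sketch in Python. It is not the gtsvector_picksplit code: the names approx_furthest_pair, exact_furthest_pair and hamming are made up for illustration, and representing each signature as a plain integer compared by Hamming distance is an assumption. The exhaustive all-pairs search is included only as a reference to compare against.

```python
from itertools import combinations


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer signatures (assumed metric)."""
    return bin(a ^ b).count("1")


def approx_furthest_pair(items, distance=hamming):
    """One-pass heuristic: returns (i, j, dist) for a pair that is far apart,
    though not necessarily the furthest. Uses O(n) distance calls."""
    if len(items) < 2:
        raise ValueError("need at least two items")

    best_i, best_j = 0, 1
    best = distance(items[0], items[1])

    for i in range(2, len(items)):
        d_i = distance(items[best_i], items[i])
        d_j = distance(items[best_j], items[i])
        if d_i > best:
            # New item is further from best_i than best_j was: replace best_j.
            best, best_j = d_i, i
        elif d_j > best:
            # Otherwise try pairing it with best_j and replacing best_i.
            best, best_i = d_j, i

    return best_i, best_j, best


def exact_furthest_pair(items, distance=hamming):
    """Reference answer: exhaustive O(n^2) search over all pairs."""
    return max(
        ((i, j, distance(items[i], items[j]))
         for i, j in combinations(range(len(items)), 2)),
        key=lambda t: t[2],
    )


if __name__ == "__main__":
    sigs = [0b0000, 0b0001, 0b1111, 0b0011]
    print(approx_furthest_pair(sigs))
    print(exact_furthest_pair(sigs))
```

The heuristic makes two distance calls per item, so it is linear in the number of entries, where the exhaustive search is quadratic. The distance it returns can never exceed the exact one, so comparing the two on real signatures is an easy way to see how much is lost.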

/* Steinar */
--
Homepage: http://www.sesse.net/
