Re: Creation of tsearch2 index is very slow

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephan Vollmer <svollmer(at)gmx(dot)de>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Creation of tsearch2 index is very slow
Date: 2006-01-20 15:35:21
Message-ID: 17454.1137771321@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

Stephan Vollmer <svollmer(at)gmx(dot)de> writes:
> I noticed that the creation of a GIST index for tsearch2 takes very
> long - about 20 minutes. CPU utilization is 100 %, the resulting
> index file size is ~25 MB. Is this behaviour normal?

This has been complained of before. GIST is always going to be slower
at index-building than btree; in the btree case there's a simple optimal
strategy for making an index (ie, sort the keys) but for GIST we don't
know anything better to do than insert the keys one at a time.

However, I'm not sure that anyone's tried to do any performance
optimization on the GIST insert code ... there might be some low-hanging
fruit there. It'd be interesting to look at a gprof profile of what the
backend is doing during the index build. Do you have the ability to do
that, or would you be willing to give your data to someone else to
investigate with? (The behavior is very possibly data-dependent, which
is why I want to see a profile with your specific data and not just some
random dataset or other.)

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message DANTE ALEXANDRA 2006-01-20 15:52:37 Re: Initdb panic: invalid record offset at 0/0 creating
Previous Message Keary Suska 2006-01-20 15:32:47 Re: How to fetch rows with multiple values

Browse pgsql-performance by date

  From Date Subject
Next Message Bruce Momjian 2006-01-20 16:19:32 Re: Extremely irregular query performance
Previous Message Neil Conway 2006-01-20 15:14:59 Re: Retaining execution plans between connections?