Re: CREATE INDEX rather sluggish

From: Gavin Hamill <gdh(at)laterooms(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: CREATE INDEX rather sluggish
Date: 2006-03-30 20:45:31
Message-ID: 20060330214531.ce6620e1.gdh@laterooms.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, 30 Mar 2006 18:08:44 +0100
Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

Hello again Simon :)

> The index build time varies according to the number and type of the
> datatypes, as well as the distribution of values in the table. As well
> as the number of rows in the table.
>
> Note the x10 factor to index AreaID (integer) v KeywordType (vchar(20))

Fair enough. :) Is there much of a performance increase by using fixed-length character fields instead of varchars?

> Try trace_sort = on and then rerun the index builds to see what's
> happening there. We've speeded sort up by about 2.5 times in the current
> development version, but it does just run in single threaded mode so
> your 8 CPUs aren't helping there.

Yum - I look forward to the 8.2 release =)

> Looks like you might be just over the maintenance_work_mem limit for the
> last index builds. You can try doubling maintenance_work_mem.

You were right - needed ~370MB ... I'm happy to alloc 1GB to allow for db growth..

> The extended runtime for KeywordType is interesting in comparison to
> LowerText, which on the face of it is a longer column. My guess would be
> that LowerText is fairly unique and sorts quickly, whereas KeywordType
> is fairly non-unique with a high average row length that require
> complete string comparison before deciding it is actually the same
> value.

From looking at a few samples of the millions of rows it seems that it's actually KeywordType that's more unique - LowerText is simply an lowercase representation of the name of this search-keyword, so it's much less unique. Fun stuff :)

> You might want to try using codes rather than textual KeywordTypes.

That makes sense - I can't get a grip on the data in KeywordType at the moment .. many are more obvious like 'RGN' 'AREA' 'MKT' 'LK' for Region, Area, Market and Lake, but many other rows have '1'.

> You might try using partial indexes also, along the lines of
>
> CREATE INDEX ix_keyword_type ON "Keyword" USING btree ("KeywordType") WHERE KeywordType IS NOT NULL;

Well, each row does have a KeywordType, so no row has a NULL entry...

> Best Regards, Simon Riggs

Cheers :)
Gavin.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message PFC 2006-03-30 21:31:25 Re: Decide between Postgresql and Mysql (help of
Previous Message Gavin Hamill 2006-03-30 20:21:38 Re: CREATE INDEX rather sluggish