Re: GiST index performance

From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Matthew Wakeling <matthew(at)flymine(dot)org>
Cc: Kenneth Marshall <ktm(at)rice(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-performance(at)postgresql(dot)org
Subject: Re: GiST index performance
Date: 2010-03-22 14:02:30
Message-ID: 4BA77876.30803@gmail.com
Lists: pgsql-performance

Matthew Wakeling wrote:
> On Sat, 20 Mar 2010, Yeb Havinga wrote:
>> The gist virtual pages would then match more the original blocksizes
>> that were used in Guttman's R-tree paper (first google result, then
>> figure 4.5). Since the nature/characteristics of the underlying
>> datatypes and keys is not changed, it might be that with the disk
>> pages getting larger, gist indexing has therefore become unexpectedly
>> inefficient.
>
> Yes, that is certainly a factor. For example, the page size for bioseg
> which we use here is 130 entries, which is very excessive, and doesn't
> allow very deep trees. On the other hand, it means that a single disc
> seek performs quite a lot of work.
Yeah, I only ran tests that fit in memory and wondered about the
increased I/O. However, I bet that even for databases bigger than RAM,
the benefit of having to fan out to fewer pages still outweighs the
over-general non-leaf nodes and might still result in less disk I/O. I
redid some earlier benchmarks with other datatypes using a 1kB block
size, and also with multicolumn GiST indexes. The multicolumn variant
showed an even greater benefit than the single-column indexes, for both
equality and range scans (execution times down to about 20% of the
original). If GiST is important to you, I really recommend doing a test
with 1kB blocks.
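
To put rough numbers on the depth side of that trade-off, here is a
minimal sketch. It assumes the 130 entries per page you mention are for
the default 8kB block, that a 1kB block would hold roughly an eighth of
that (about 16 entries, ignoring page header overhead), and a
hypothetical table of 10 million rows. It only counts tree levels; it
says nothing about how many subtrees a real GiST scan has to visit
because of overlapping keys, which is where the smaller pages help.

```python
def min_tree_height(n_entries: int, fanout: int) -> int:
    """Return the smallest number of levels L such that fanout**L >= n_entries.

    This is the height of a perfectly packed tree; a real index is
    never fully packed, so treat the result as a lower bound.
    """
    if n_entries < 1:
        raise ValueError("n_entries must be at least 1")
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    levels = 1
    capacity = fanout  # entries reachable through a tree of `levels` levels
    while capacity < n_entries:
        capacity *= fanout
        levels += 1
    return levels


if __name__ == "__main__":
    rows = 10_000_000  # hypothetical table size, not from the thread
    # 130 entries/page is the bioseg figure quoted above (assumed 8kB block);
    # 16 entries/page is an assumed estimate for a 1kB block.
    for label, fanout in (("8kB block", 130), ("1kB block", 16)):
        print(label, "fanout", fanout, "levels", min_tree_height(rows, fanout))
```

So under these assumptions the 1kB tree is a couple of levels deeper,
and the question is whether the tighter non-leaf keys save more page
visits than those extra levels cost.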

regards,
Yeb Havinga
