Re: GiST index performance

From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Matthew Wakeling <matthew(at)flymine(dot)org>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-performance(at)postgresql(dot)org
Subject: Re: GiST index performance
Date: 2010-03-16 16:18:19
Message-ID: 4B9FAF4B.20907@gmail.com
Lists: pgsql-performance

Matthew Wakeling wrote:
>> Matthew Wakeling wrote:
>>> A second quite distinct issue is the general performance of GiST indexes
>>> which is also mentioned in the old thread linked from Open Items. For
>>> that, we have a test case at
>>> http://archives.postgresql.org/pgsql-performance/2009-04/msg00276.php
>>> for btree_gist indexes. I have a similar example with the bioseg GiST
>>> index. I have completely reimplemented the same algorithms in Java for
>>> algorithm investigation and instrumentation purposes, and it runs about
>>> a hundred times faster than in Postgres. I think this is a problem, and
>>> I'm willing to do some investigation to try and solve it.
> I have not made any progress on this issue. I think Oleg and Teodor
> would be better placed working it out. All I can say is that I
> implemented the exact same indexing algorithm in Java, and it
> performed 100 times faster than Postgres. Now, Postgres has to do a
> lot of additional work, like mapping the index onto disc, locking
> pages, and abstracting to plugin user functions, so I would expect
> some difference - I'm not sure 100 times is reasonable though. I tried
> to do some profiling, but couldn't see any one section of code that
> was taking too much time. Not sure what I can further do.
Hello Matthew and list,

A lot of time is spent in the gistget.c code, and there are a lot of
FunctionCall5 calls to the GiST consistent function, which is out of sight
for gprof. Something different but related, since it also concerns GiST: we
noticed before that GiST indexes that use a compressed form for index
entries suffer from repeated compress calls on query operands (see
http://archives.postgresql.org/pgsql-hackers/2009-05/msg00078.php).

The btree_gist int4 compress function calls the generic
gbt_num_compress, which does a palloc. Maybe this palloc is also hit a
lot when scanning the index, because the constants that are queried with
are repeatedly compressed and palloced.
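
To make the suspicion concrete, here is a minimal standalone C sketch. It is
not PostgreSQL code: RangeKey, compress_key, scan_recompress and scan_cached
are hypothetical names made up for illustration, and malloc stands in for
palloc. It only shows how compressing the query constant once per visited
entry multiplies the number of allocations compared with compressing it once
per scan.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a compressed index key (not the btree_gist type). */
typedef struct {
    int32_t lower;
    int32_t upper;
} RangeKey;

/* Counts allocations made by compress_key, so the two scans can be compared. */
long alloc_count = 0;

/* Stand-in for a compress function that allocates on every call. */
static RangeKey *compress_key(int32_t value)
{
    RangeKey *key = malloc(sizeof(RangeKey));
    assert(key != NULL);
    alloc_count++;
    key->lower = value;
    key->upper = value;
    return key;
}

/* Stand-in for a consistent check: does the entry's range contain the query? */
static int key_contains(const RangeKey *entry, const RangeKey *query)
{
    return entry->lower <= query->lower && query->upper <= entry->upper;
}

/* Suspected pattern: the query constant is recompressed for every entry. */
int scan_recompress(const RangeKey *entries, int n, int32_t query)
{
    int matches = 0;
    for (int i = 0; i < n; i++) {
        RangeKey *q = compress_key(query);   /* one allocation per entry */
        matches += key_contains(&entries[i], q);
        free(q);
    }
    return matches;
}

/* Alternative: compress the query constant once and reuse it for the scan. */
int scan_cached(const RangeKey *entries, int n, int32_t query)
{
    RangeKey *q = compress_key(query);       /* one allocation per scan */
    int matches = 0;
    for (int i = 0; i < n; i++)
        matches += key_contains(&entries[i], q);
    free(q);
    return matches;
}
```

Both scans return the same match count; only alloc_count differs, growing
with the number of entries visited in the first case and staying at one in
the second. Whether this accounts for a noticeable part of the slowdown in
the real index scan is something only profiling can tell.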

regards,
Yeb Havinga
