Re: rtree/gist index taking enormous amount of space in 8.2.3

From: "Dolafi, Tom" <dolafit(at)janelia(dot)hhmi(dot)org>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-performance(at)postgresql(dot)org>,"Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>,"Teodor Sigaev" <teodor(at)sigaev(dot)ru>
Subject: Re: rtree/gist index taking enormous amount of space in 8.2.3
Date: 2007-06-29 19:40:30
Message-ID: AE9860225100F14D87B26D0D4D6766DB46F452@EXCHANGE03.janelia.priv
Lists: pgsql-performance
The application need is to determine genomic features present in a
user-defined portion of a chromosome.  My guess is that features (boxes)
are overlapping along a line (chromosome), and there is a need to
represent them as being stacked.  Since I'm not certain of its exact
use, I've emailed the application owner to find the motivation as to why
a geometric index structure is used, and why the boxes are tall and
overlapping.  As a side note, the data model for our application is
based on a popular bioinformatics open source project called chado.

Thanks,
Tom
-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us] 
Sent: Friday, June 29, 2007 2:38 PM
To: Dolafi, Tom
Cc: pgsql-performance(at)postgresql(dot)org; Oleg Bartunov; Teodor Sigaev
Subject: Re: [PERFORM] rtree/gist index taking enormous amount of space
in 8.2.3 

"Dolafi, Tom" <dolafit(at)janelia(dot)hhmi(dot)org> writes:
> In the meantime I've dropped the index, which has resulted in an overall
> performance gain on queries against the table, but we have not tested
> the part of the application which would utilize this index.

I noted that with the same (guessed-at) distribution of fmin/fmax, the
index size remains reasonable if you change the derived boxes to

CREATE OR REPLACE FUNCTION boxrange(integer, integer)
  RETURNS box AS
    'SELECT box (point($1, $1), point($2, $2))'
  LANGUAGE 'sql' STRICT IMMUTABLE;

which makes sense from the point of view of geometric intuition: instead
of a bunch of very tall, mostly very narrow, mostly overlapping boxes,
you have a bunch of small square boxes spread out along a line.  So it
stands to reason that a geometrically-motivated index structure would
work a lot better on the latter.  I don't know though whether your
queries can be adapted to work with this.  What was the index being used
for, exactly?
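
As a rough sketch of how a range lookup might be adapted (an assumption,
not the application's actual query): with the diagonal boxes, two boxes
overlap exactly when the underlying [fmin, fmax] intervals overlap, so the
box overlap operator && can still express "features present in a region".
The table featureloc_demo, the index name, and the region 1000..2000 are
hypothetical and only for illustration.

```sql
-- Hypothetical table standing in for the real feature-location table.
CREATE TABLE featureloc_demo (
    feature_id integer,
    fmin       integer,
    fmax       integer
);

-- Expression index over the diagonal boxes built by boxrange() above.
CREATE INDEX featureloc_demo_boxrange_idx
    ON featureloc_demo USING gist (boxrange(fmin, fmax));

-- Features overlapping the user-defined region 1000..2000
-- (illustrative values; box overlap includes touching edges).
SELECT feature_id, fmin, fmax
  FROM featureloc_demo
 WHERE boxrange(fmin, fmax) && boxrange(1000, 2000);
```

Whether the planner actually uses the index would need checking with
EXPLAIN against the real data.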

			regards, tom lane