Skip site navigation (1) Skip section navigation (2)

Re: Optimize date query for large child tables: GiST or GIN?

From: David Jarvis <thangalin(at)gmail(dot)com>
To: Matthew Wakeling <matthew(at)flymine(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thom Brown <thombrown(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Optimize date query for large child tables: GiST or GIN?
Date: 2010-05-21 04:11:52
Message-ID: AANLkTimjPLmnSZDe3nNBVGjJ0CGIQd_Ic8IxHb4fUG9v@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hi,

(An EXPLAIN ANALYSE would be better here). Look at the expected number of
> stations


"Nested Loop  (cost=0.00..994.94 rows=4046 width=4) (actual
time=0.053..41.173 rows=78 loops=1)"
"  Join Filter: ((6371.009::double precision *
sqrt((pow(radians(((c.latitude_decimal - s.latitude_decimal))::double
precision), 2::double precision) + (cos((radians(((c.latitude_decimal +
s.latitude_decimal))::double precision) / 2::double precision)) *
pow(radians(((c.longitude_decimal - s.longitude_decimal))::double
precision), 2::double precision))))) <= 25::double precision)"
"  ->  Index Scan using city_pkey1 on city c  (cost=0.00..6.27 rows=1
width=16) (actual time=0.014..0.016 rows=1 loops=1)"
"        Index Cond: (id = 5182)"
"  ->  Seq Scan on station s  (cost=0.00..321.08 rows=12138 width=20)
(actual time=0.007..5.256 rows=12139 loops=1)"
"        Filter: ((s.elevation >= 0) AND (s.elevation <= 3000))"
"Total runtime: 41.235 ms"

expects to have to touch a large proportion of the measurement table,
> therefore it thinks that it will be fastest to do a seq scan. In actual
> fact, for 78 stations, the index would be faster, but for 4046 it wouldn't.
>

This is rather unexpected. I'd have figured it would use the actual number.


> If you will be querying by season quite regularly, had you considered
> partitioning by season?
>

I have no idea what the "regular" queries will be. The purpose of the system
is to open the data up to the public using a simple user interface so that
they can generate their own custom reports. That user interface allows
people to pick year intervals, day ranges, elevations, categories
(temperature, precipitation, snow depth, etc.), and lat/long perimeter
coordinates (encompassing any number of stations) or a city and radius.

Dave

In response to

pgsql-performance by date

Next:From: Yeb HavingaDate: 2010-05-21 08:38:43
Subject: Re: Optimize date query for large child tables: GiST or GIN?
Previous:From: Matthew WakelingDate: 2010-05-21 03:27:54
Subject: Re: Optimize date query for large child tables: GiST or GIN?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group