Re: Optimize date query for large child tables: GiST or GIN?

From: David Jarvis <thangalin(at)gmail(dot)com>
To: Matthew Wakeling <mnw21(at)cam(dot)ac(dot)uk>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Optimize date query for large child tables: GiST or GIN?
Date: 2010-05-20 20:19:25
Message-ID: AANLkTik4bMwdV6C9Xyv1R1RfrV4OltRGwExGcdZZyEHI@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

I was still referring to the measurement table. You have an index on
> stationid, but still seem to be getting a sequential scan. Maybe the planner
> does not realise that you are selecting a small number of stations. Posting
> an EXPLAIN ANALYSE would really help here.
>

Here is the result from an *EXPLAIN ANALYZE*:

"HashAggregate (cost=5486752.27..5486756.27 rows=200 width=12) (actual
time=314328.657..314328.728 rows=110 loops=1)"
" -> Hash Semi Join (cost=1045.52..5451155.11 rows=4746289 width=12)
(actual time=197.950..313605.795 rows=463926 loops=1)"
" Hash Cond: (m.station_id = s.id)"
" -> Append (cost=0.00..5343318.08 rows=4746289 width=16) (actual
time=74.411..306533.820 rows=42737997 loops=1)"
" -> Seq Scan on measurement m (cost=0.00..148.00 rows=1
width=20) (actual time=0.001..0.001 rows=0 loops=1)"
" Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= ((((date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((((((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))"
" -> Seq Scan on measurement_01_001 m (cost=0.00..438102.26
rows=389080 width=16) (actual time=74.409..24800.171 rows=3503256 loops=1)"
" Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= ((((date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((((((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))"
" -> Seq Scan on measurement_02_001 m (cost=0.00..399834.28
rows=354646 width=16) (actual time=29.217..22209.877 rows=3196631 loops=1)"
" Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= ((((date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((((((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))"
" -> Seq Scan on measurement_03_001 m (cost=0.00..438380.23
rows=389148 width=16) (actual time=15.915..24366.766 rows=3503937 loops=1)"
" Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= ((((date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((((((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))"
" -> Seq Scan on measurement_04_001 m (cost=0.00..432850.57
rows=384539 width=16) (actual time=15.852..24280.031 rows=3461931 loops=1)"
" Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= ((((date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((((((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))"
" -> Seq Scan on measurement_05_001 m (cost=0.00..466852.96
rows=415704 width=16) (actual time=19.495..26158.828 rows=3737276 loops=1)"
" Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= ((((date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((((((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))"
" -> Seq Scan on measurement_06_001 m (cost=0.00..458098.05
rows=407244 width=16) (actual time=25.062..26054.019 rows=3668108 loops=1)"
" Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= ((((date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((((((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))"
" -> Seq Scan on measurement_07_001 m (cost=0.00..472679.60
rows=420736 width=16) (actual time=17.852..26829.286 rows=3784626 loops=1)"
" Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= ((((date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((((((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))"
" -> Seq Scan on measurement_08_001 m (cost=0.00..471200.02
rows=418722 width=16) (actual time=20.781..26875.574 rows=3772848 loops=1)"
" Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= ((((date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((((((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))"
" -> Seq Scan on measurement_09_001 m (cost=0.00..447468.05
rows=397415 width=16) (actual time=17.454..25355.688 rows=3580395 loops=1)"
" Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= ((((date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((((((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))"
" -> Seq Scan on measurement_10_001 m (cost=0.00..449691.17
rows=399362 width=16) (actual time=17.911..25144.829 rows=3594957 loops=1)"
" Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= ((((date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((((((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))"
" -> Seq Scan on measurement_11_001 m (cost=0.00..429363.73
rows=380826 width=16) (actual time=18.944..24106.477 rows=3430085 loops=1)"
" Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= ((((date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((((((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))"
" -> Seq Scan on measurement_12_001 m (cost=0.00..438649.19
rows=388866 width=16) (actual time=22.830..24466.324 rows=3503947 loops=1)"
" Filter: ((category_id = 1) AND (date_part('year'::text,
(taken)::timestamp without time zone) >= 1900::double precision) AND
(date_part('year'::text, (taken)::timestamp without time zone) <=
2009::double precision) AND (taken >= (((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-01-01'::text))::date) AND
(taken <= ((((date_part('year'::text, (taken)::timestamp without time zone)
+ GREATEST(((-1)::double precision * sign((((((date_part('year'::text,
(taken)::timestamp without time zone))::text || '-12-31'::text))::date -
(((date_part('year'::text, (taken)::timestamp without time zone))::text ||
'-01-01'::text))::date))::double precision)), 0::double precision)))::text
|| '-12-31'::text))::date))"
" -> Hash (cost=994.94..994.94 rows=4046 width=4) (actual
time=120.793..120.793 rows=129 loops=1)"
" -> Nested Loop (cost=0.00..994.94 rows=4046 width=4)
(actual time=71.112..120.728 rows=129 loops=1)"
" Join Filter: ((6371.009::double precision *
sqrt((pow(radians(((c.latitude_decimal - s.latitude_decimal))::double
precision), 2::double precision) + (cos((radians(((c.latitude_decimal +
s.latitude_decimal))::double precision) / 2::double precision)) *
pow(radians(((c.longitude_decimal - s.longitude_decimal))::double
precision), 2::double precision))))) <= 50::double precision)"
" -> Index Scan using city_pkey1 on city c
(cost=0.00..6.27 rows=1 width=16) (actual time=61.311..61.314 rows=1
loops=1)"
" Index Cond: (id = 5182)"
" -> Seq Scan on station s (cost=0.00..321.08
rows=12138 width=20) (actual time=9.745..19.035 rows=12139 loops=1)"
" Filter: ((s.elevation >= 0) AND (s.elevation <=
3000))"
"Total runtime: 314329.201 ms"

Dave

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message David Jarvis 2010-05-20 20:30:48 Re: Optimize date query for large child tables: GiST or GIN?
Previous Message Tom Lane 2010-05-20 20:18:58 Re: Optimize date query for large child tables: GiST or GIN?