Skip site navigation (1) Skip section navigation (2)

Partition table query performance

From: "Greg Jaman" <gjaman(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Partition table query performance
Date: 2008-11-26 04:07:46
Message-ID: (view raw or flat)
Lists: pgsql-performance
I have a problem with partitioning and I'm wondering if anyone can provide
some insight.   I'm trying to find the max value of a column across multiple
partitions.  The query against the partition set is quite slow while queries
against child partitions is very fast!

I setup a basic Range Partition table definition.
   A parent table:  Data {  dataID, sensorID, value, ts }
   child tables   Data_YYYY_WEEKNO { dataID, sensorID, value, ts}  inherited
from Data
          Each child tables has a primary key index on dataID and a
composite index on (sensorID, ts).
          Each child has check constraints for the week range identified in
the table name (non overlapping)

I want to perform a simple operation:  select the max ts (timestamp) giving
a sensorID.  Given my indexs on the table, this should be a simple and fast

DB=# EXPLAIN ANALYZE  select max(ts) from "Data" where valid=true and
"sensorID"=8293 ;
 Aggregate  (cost=334862.92..334862.93 rows=1 width=8) (actual
time=85183.381..85183.383 rows=1 loops=1)
   ->  Append  (cost=2.30..329397.68 rows=2186096 width=8) (actual
time=1.263..76592.755 rows=2205408 loops=1)
         ->  Bitmap Heap Scan on "Data"  (cost=2.30..8.84 rows=3 width=8)
(actual time=0.027..0.027 rows=0 loops=1)
               Recheck Cond: ("sensorID" = 8293)
               Filter: valid
               ->  Bitmap Index Scan on "def_data_sensorID_ts"
(cost=0.00..2.30 rows=6 width=0) (actual time=0.021..0.021 rows=0 loops=1)
                     Index Cond: ("sensorID" = 8293)
         ->  *Index Scan using "Data_2008_01_sensorID_ts_index" on
"Data_2008_01" "Data"*  (cost=0.00..4.27 rows=1 width=8) (actual
time=0.014..0.014 rows=0 loops=1)
               Index Cond: ("sensorID" = 8293)
               Filter: valid
         ->  *Bitmap Heap Scan on "Data_2008_02" "Data"*  (cost=3.01..121.08
rows=98 width=8) (actual time=0.017..0.017 rows=0 loops=1)
               Recheck Cond: ("sensorID" = 8293)
               Filter: valid
               ->  Bitmap Index Scan on "Data_2008_02_sensorID_ts_index"
(cost=0.00..2.99 rows=98 width=0) (actual time=0.011..0.011 rows=0 loops=1)
                     Index Cond: ("sensorID" = 8293)
. (omitted a list of all partitions with same as data above)
 Total runtime: 85188.694 ms

When I query against a specific partition:

DB=# EXPLAIN ANALYZE  select max(ts) from "Data_2008_48" where valid=true
and "sensorID"=8293 ;


 Result  (cost=0.10..0.11 rows=1 width=0) (actual time=3.830..3.832 rows=1
     ->  Limit  (cost=0.00..0.10 rows=1 width=8) (actual time=3.817..3.819
rows=1 loops=1)
           ->  Index Scan Backward using "Data_2008_48_sensorID_ts_index" on
"Data_2008_48"  (cost=0.00..15304.55 rows=148959 width=8) (actual
time=3.813..3.813 rows=1 loops=1)
                 Index Cond: ("sensorID" = 8293)
                 Filter: ((ts IS NOT NULL) AND valid)
 Total runtime: 0.225 ms

The query plan against the child partition makes sense - Uses the index to
find the max value.  The query plan for the partitions uses a combination of
bitmap heap scans and index scans.
Why would the query plan choose to use a bitmap heap scan after bitmap index
scan or is that the best choice?  (what is it doing?) and what can I do to
speed up this query?

As a sanity check I did a union query of all partitions to find the max(ts).
My manual union query executed in 13ms vs the query against the parent table
that was 85,188ms!!!.

Greg Jaman


pgsql-performance by date

Next:From: Richard HuxtonDate: 2008-11-26 10:41:12
Subject: Re: Increasing pattern index query speed
Previous:From: Mark KirkwoodDate: 2008-11-25 19:37:57
Subject: Re: Monitoring buffercache...

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group