Re: partition queries hitting all partitions even though check key is specified

From: Kevin Kempter <kevink(at)consistentstate(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: partition queries hitting all partitions even though check key is specified
Date: 2009-09-02 15:39:02
Message-ID: 200909020939.02869.kevink@consistentstate.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wednesday 02 September 2009 09:19:20 Tom Lane wrote:
> Kevin Kempter <kevink(at)consistentstate(dot)com> writes:
> > I cant figure out why we're scanning all of our partitions.
>
> The example works as expected for me:
>
> regression=# CREATE TABLE url_hits (
> id integer NOT NULL,
> content_type_id integer,
> file_extension_id integer,
> "time" integer,
> bytes integer NOT NULL,
> path_id integer);
> CREATE TABLE
> regression=# create table url_hits_2011_12 (
> check (
> "time" >= extract ('epoch' from timestamp '2011-12-01
> 00:00:00')::int4
> and "time" <= extract ('epoch' from timestamp '2011-12-31
> 23:59:59')::int4
> )
> ) INHERITS (url_hits);
> CREATE TABLE
> regression=# create table url_hits_2009_08 (
> check (
> "time" >= extract ('epoch' from timestamp '2009-08-01
> 00:00:00')::int4
> and "time" <= extract ('epoch' from timestamp '2009-08-31
> 23:59:59')::int4
> )
> ) INHERITS (url_hits);
> CREATE TABLE
> regression=# explain select * from url_hits where "time" <
> date_part('epoch'::text, '2009-08-12'::timestamp without time
> zone)::integer; QUERY PLAN
> ---------------------------------------------------------------------------
>-------------- Result (cost=0.00..82.50 rows=1401 width=24)
> -> Append (cost=0.00..82.50 rows=1401 width=24)
> -> Seq Scan on url_hits (cost=0.00..27.50 rows=467 width=24)
> Filter: ("time" < 1250049600)
> -> Seq Scan on url_hits_2011_12 url_hits (cost=0.00..27.50
> rows=467 width=24) Filter: ("time" < 1250049600)
> -> Seq Scan on url_hits_2009_08 url_hits (cost=0.00..27.50
> rows=467 width=24) Filter: ("time" < 1250049600)
> (8 rows)
>
> regression=# set constraint_exclusion TO 1;
> SET
> regression=# explain select * from url_hits where "time" <
> date_part('epoch'::text, '2009-08-12'::timestamp without time
> zone)::integer; QUERY PLAN
> ---------------------------------------------------------------------------
>-------------- Result (cost=0.00..55.00 rows=934 width=24)
> -> Append (cost=0.00..55.00 rows=934 width=24)
> -> Seq Scan on url_hits (cost=0.00..27.50 rows=467 width=24)
> Filter: ("time" < 1250049600)
> -> Seq Scan on url_hits_2009_08 url_hits (cost=0.00..27.50
> rows=467 width=24) Filter: ("time" < 1250049600)
> (6 rows)
>
>
> You sure you remembered those fiddly little casts everywhere?
> (Frankly, declaring "time" as integer and not timestamp here strikes
> me as utter lunacy.) What PG version are you using?
>
> regards, tom lane

I actually inherited the whole "time" scenario - agreed, its crazy.

In any case I ran the exact same query as you and it still scans most (but not
all) partitions. Were on version


pwreport=# set constraint_exclusion TO 1;SET
pwreport=#
explain select * from pwreport.url_hits where "time" <
date_part('epoch'::text, '2009-08-12'::timestamp without time zone)::integer;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.00..9677473.91 rows=148258840 width=432)
-> Append (cost=0.00..9677473.91 rows=148258840 width=432)
-> Seq Scan on url_hits (cost=0.00..12.12 rows=57 width=432)
Filter: ("time" < 1250035200)
-> Seq Scan on url_hits_2009_07 url_hits (cost=0.00..12.12 rows=57
width=432)
Filter: ("time" < 1250035200)
-> Seq Scan on url_hits_2009_06 url_hits (cost=0.00..12.12 rows=57
width=432)
Filter: ("time" < 1250035200)
-> Seq Scan on url_hits_2009_05 url_hits (cost=0.00..12.12 rows=57
width=432)
Filter: ("time" < 1250035200)
-> Seq Scan on url_hits_2009_04 url_hits (cost=0.00..12.12 rows=57
width=432)
Filter: ("time" < 1250035200)
-> Seq Scan on url_hits_2009_03 url_hits (cost=0.00..12.12 rows=57
width=432)
Filter: ("time" < 1250035200)
-> Seq Scan on url_hits_2009_02 url_hits (cost=0.00..12.12 rows=57
width=432)
Filter: ("time" < 1250035200)
-> Seq Scan on url_hits_2009_01 url_hits (cost=0.00..12.12 rows=57
width=432)
Filter: ("time" < 1250035200)
-> Seq Scan on url_hits_2008_12 url_hits (cost=0.00..12.12 rows=57
width=432)
Filter: ("time" < 1250035200)
-> Seq Scan on url_hits_2008_11 url_hits (cost=0.00..12.12 rows=57
width=432)
Filter: ("time" < 1250035200)
-> Seq Scan on url_hits_2008_10 url_hits (cost=0.00..12.12 rows=57
width=432)
Filter: ("time" < 1250035200)
-> Seq Scan on url_hits_2008_09 url_hits (cost=0.00..12.12 rows=57
width=432)
Filter: ("time" < 1250035200)
-> Index Scan using url_hits_2009_08_time_index on url_hits_2009_08
url_hits (cost=0.00..9677328.41 rows=148258156 width=131)
Index Cond: ("time" < 1250035200)
(28 rows)

> id integer NOT NULL,
> content_type_id integer,
> file_extension_id integer,
> "time" integer,
> bytes integer NOT NULL,
> path_id integer);

Also, we do have indexes on the child table, will this change things?

\d url_hits_2009_08
Table "url_hits_2009_08"
Column | Type |
Modifiers
-------------------+-----------------------+----------------------------------------------------------------
id | integer | not null default
nextval('url_hits_id_seq'::regclass)
direction | proxy_direction_enum | not null
content_type_id | integer |
file_extension_id | integer |
time | integer |
bytes | integer | not null
path_id | integer |
Indexes:
"url_hits_2009_08_pk" PRIMARY KEY, btree (id)
"url_hits_2009_08_time_index" btree ("time")
Check constraints:
"url_hits_2009_08_time_check" CHECK ("time" >= date_part('epoch'::text,
'2009-08-01 00:00:00'::timestamp without time zone)::integer AND "time" <=
date_part('epoch'::text, '2009-08-31 23:59:59'::timestamp without time
zone)::integer)
Inherits: url_hits
Tablespace: "pwreport_1000"

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Wei Zheng 2009-09-02 15:39:25 Looking for real configuration data
Previous Message Pat Chan 2009-09-02 15:29:14 pg_stat_activity.current_query explanation?