Inefficient min/max against partition (ver 9.1.1)

From: "McGehee, Robert" <Robert(dot)McGehee(at)geodecapital(dot)com>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Inefficient min/max against partition (ver 9.1.1)
Date: 2012-02-29 17:32:32
Message-ID: 17B09E7789D3104E8F5EEB0582A8D66FBAA3FCEDC0@MSGRTPCCRF2WIN.DMN1.FMR.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On PostgreSQL 9.1.1, I'm experiencing extremely slow/inefficient min/max queries against a partitioned table, despite the recent improvements made in version 9.1. I haven't seen this issue discussed since 9.1 was released, so I wanted to provide an example of the inefficient execution plan in case this is not a known issue with the new version.

In my case, the query analyzer chooses the wrong index to scan of the child table when the query is made against the parent table. The tables are partitioned by 'fctid'. The query 'SELECT max(date) FROM table WHERE fctid=301 and sec_id=1' correctly uses the index (sec_id, date) when querying against the child table (0.1ms), but when run against the parent table, the planner chooses to scan the (date, sec_id) primary key instead, resulting in a full table scan in some instances (49 minutes!).

In my example the parent case is empty and all child tables have non-overlapping check constraints. Below is the schema and execution plans.

Let me know if you need anything else. Thanks, Robert

Parent table schema:
template1=# \d f_data
Table "public.f_data"
Column | Type | Modifiers
--------+----------+-----------
sec_id | integer | not null
date | date | not null
fctid | smallint | not null
value | real | not null
Indexes:
"f_data_pkey" PRIMARY KEY, btree (fctid, date, sec_id)
Triggers:
insert_f_data_trigger BEFORE INSERT ON f_data FOR EACH ROW EXECUTE PROCEDURE f_data_insert_trigger()
Number of child tables: 7 (Use \d+ to list them.)

Child table schema:
template1=# \d f_data301
Table "public.f_data301"
Column | Type | Modifiers
--------+----------+-----------
sec_id | integer | not null
date | date | not null
fctid | smallint | not null
value | real | not null
Indexes:
"pk_f_data_rsi2" PRIMARY KEY, btree (date, sec_id) CLUSTER
"f_data_rsi2_idx" btree (sec_id, date)
Check constraints:
"f_data_rsi2_fctid_check" CHECK (fctid = 301)
Inherits: f_data

template1=# EXPLAIN ANALYZE SELECT max(date) FROM f_data301 WHERE fctid=301 and sec_id=1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Result (cost=1.84..1.85 rows=1 width=0) (actual time=0.077..0.078 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.00..1.84 rows=1 width=4) (actual time=0.074..0.074 rows=0 loops=1)
-> Index Scan Backward using f_data_rsi2_idx on f_data301 (cost=0.00..6370.59 rows=3465 width=4) (a
Index Cond: ((sec_id = 1) AND (date IS NOT NULL))
Filter: (fctid = 301)
Total runtime: 0.132 ms
(7 rows)

template1=# EXPLAIN ANALYZE SELECT max(date) FROM f_data where fctid=301 and sec_id=1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Result (cost=522.10..522.11 rows=1 width=0) (actual time=2921439.560..2921439.561 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.02..522.10 rows=1 width=4) (actual time=2921439.554..2921439.554 rows=0 loops=1)
-> Merge Append (cost=0.02..1809543.34 rows=3466 width=4) (actual time=2921439.551..2921439.551 row
Sort Key: public.f_data.date
-> Sort (cost=0.01..0.02 rows=1 width=4) (actual time=0.008..0.008 rows=0 loops=1)
Sort Key: public.f_data.date
Sort Method: quicksort Memory: 25kB
-> Seq Scan on f_data (cost=0.00..0.00 rows=1 width=4) (actual time=0.002..0.002 rows=0
Filter: ((date IS NOT NULL) AND (fctid = 301) AND (sec_id = 1))
-> Index Scan Backward using pk_f_data_rsi2 on f_data301 f_data (cost=0.00..1809499.99 rows=3
Index Cond: ((date IS NOT NULL) AND (sec_id = 1))
Filter: (fctid = 301)
Total runtime: 2921439.645 ms
(14 rows)

template1=# select version();
version
---------------------------------------------------------------------------------------------------------------
PostgreSQL 9.1.1 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-50), 64-bit

Robert McGehee, CFA
Geode Capital Management, LLC
One Post Office Square, 28th Floor | Boston, MA | 02109
Direct: (617)392-8396

This e-mail, and any attachments hereto, are intended for use by the addressee(s) only and may contain information that is (i) confidential information of Geode Capital Management, LLC and/or its affiliates, and/or (ii) proprietary information of Geode Capital Management, LLC and/or its affiliates. If you are not the intended recipient of this e-mail, or if you have otherwise received this e-mail in error, please immediately notify me by telephone (you may call collect), or by e-mail, and please permanently delete the original, any print outs and any copies of the foregoing. Any dissemination, distribution or copying of this e-mail is strictly prohibited.

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Robert Haas 2012-02-29 18:32:48 Re: text search: tablescan cost for a tsvector
Previous Message Stefan Keller 2012-02-29 15:28:25 Re: PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?