Constraint partition index usage

From: Michael Glaesemann <michael(dot)glaesemann(at)myyearbook(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Michael Glaesemann <michael(dot)glaesemann(at)myyearbook(dot)com>
Subject: Constraint partition index usage
Date: 2008-10-14 18:40:53
Message-ID: 6514E4AB-5D33-4B8A-ADF3-BCAB99FAA3A0@myyearbook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

We've been looking at breaking out some of our larger logging-type
tables (INSERT only, other than DELETEs for regularly removing old
rows for maintenance) using constraint partitioning to avoid the need
for heavy vacuuming. For some cases it's working well. In others,
we're running into performance issues due to the planner no longer
using indexes for some queries. I've included a couple of examples
where we should have enough information to take advantage of the
available indexes.

Here's the example schema.

=# \d foo.bars
Table "foo.bars"
Column | Type | Modifiers
-----------+--------------------------+------------------------
user_id | integer | not null
bar | ip4 | not null
logged_at | timestamp with time zone | not null default now()
Triggers:
partition_bars BEFORE INSERT ON foo.bars FOR EACH ROW EXECUTE
PROCEDURE foo.partition_bars()

=# \d foo.bars_20081013 -- example of one of 11 tables inheriting from
foo.bars, each holding one week's-worth of data

Table "foo.bars_20081013"
Column | Type | Modifiers
-----------+--------------------------+------------------------
user_id | integer | not null
bar | ip4 | not null
logged_at | timestamp with time zone | not null default now()
Indexes:
"bars_20081013_bar_idx" btree (bar)
"bars_20081013_bar_logged_at_idx" btree (bar, logged_at)
"bars_20081013_logged_at_idx" btree (logged_at)
Check constraints:
"bars_20081013_logged_at_check" CHECK (logged_at >= '2008-10-13
00:00:00-04'::timestamp with time zone AND logged_at < '2008-10-20
00:00:00-04'::timestamp with time zone)
Inherits: foo.bars

Case 1: ORDER BY indexed_column LIMIT some_limit

Depending on the value of some_limit, we should see a win by returning
the top some_limit rows from each inherited table as candidate rows,
and then determine the top some_limit from among the candidates. (Note
in the explain that not all of the inheriting tables have data.)

production=# explain select * from foo.bars order by logged_at desc
limit 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1023367.34..1023367.34 rows=1 width=16)
-> Sort (cost=1023367.34..1134058.31 rows=44276389 width=16)
Sort Key: foo.bars.logged_at
-> Result (cost=0.00..801985.39 rows=44276389 width=16)
-> Append (cost=0.00..801985.39 rows=44276389
width=16)
-> Seq Scan on bars (cost=0.00..32.70
rows=1770 width=16)
-> Seq Scan on bars_20080915 bars
(cost=0.00..101199.40 rows=5586490 width=16)
-> Seq Scan on bars_20080922 bars
(cost=0.00..215666.84 rows=11907734 width=16)
-> Seq Scan on bars_20080908 bars
(cost=0.00..32.70 rows=1770 width=16)
-> Seq Scan on bars_20080901 bars
(cost=0.00..32.70 rows=1770 width=16)
-> Seq Scan on bars_20080825 bars
(cost=0.00..32.70 rows=1770 width=16)
-> Seq Scan on bars_20080818 bars
(cost=0.00..32.70 rows=1770 width=16)
-> Seq Scan on bars_20080811 bars
(cost=0.00..32.70 rows=1770 width=16)
-> Seq Scan on bars_20080804 bars
(cost=0.00..32.70 rows=1770 width=16)
-> Seq Scan on bars_20080929 bars
(cost=0.00..215029.52 rows=11872652 width=16)
-> Seq Scan on bars_20081006 bars
(cost=0.00..223559.96 rows=12343346 width=16)
-> Seq Scan on bars_20081013 bars
(cost=0.00..46300.77 rows=2553777 width=16)
(17 rows)

production=# explain select * from foo.bars_20080915 order by
logged_at desc limit 1;
QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..0.03 rows=1 width=16)
-> Index Scan Backward using bars_20080915_logged_at_idx on
bars_20080915 (cost=0.00..167466.14 rows=5586490 width=16)
(2 rows)

Case 2: Return [min|max](indexed_column)

Again, by finding candidate rows from each inheriting table using the
available index and then determining the min/max from among the
candidates, we should have much better performance than performing seq
scans on each inherting table.

production=# explain select min(logged_at) from foo.bars_20080915;
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.03..0.04 rows=1 width=0)
InitPlan
-> Limit (cost=0.00..0.03 rows=1 width=8)
-> Index Scan using bars_20080915_logged_at_idx on
bars_20080915 (cost=0.00..167466.14 rows=5586490 width=8)
Filter: (logged_at IS NOT NULL)
(5 rows)

production=# explain select min(logged_at) from foo.bars;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Aggregate (cost=913072.22..913072.23 rows=1 width=8)
-> Append (cost=0.00..802333.27 rows=44295577 width=8)
-> Seq Scan on bars (cost=0.00..32.70 rows=1770 width=8)
-> Seq Scan on bars_20080915 bars (cost=0.00..101199.40
rows=5586490 width=8)
-> Seq Scan on bars_20080922 bars (cost=0.00..215666.84
rows=11907734 width=8)
-> Seq Scan on bars_20080908 bars (cost=0.00..32.70
rows=1770 width=8)
-> Seq Scan on bars_20080901 bars (cost=0.00..32.70
rows=1770 width=8)
-> Seq Scan on bars_20080825 bars (cost=0.00..32.70
rows=1770 width=8)
-> Seq Scan on bars_20080818 bars (cost=0.00..32.70
rows=1770 width=8)
-> Seq Scan on bars_20080811 bars (cost=0.00..32.70
rows=1770 width=8)
-> Seq Scan on bars_20080804 bars (cost=0.00..32.70
rows=1770 width=8)
-> Seq Scan on bars_20080929 bars (cost=0.00..215029.52
rows=11872652 width=8)
-> Seq Scan on bars_20081006 bars (cost=0.00..223559.96
rows=12343346 width=8)
-> Seq Scan on bars_20081013 bars (cost=0.00..46648.65
rows=2572965 width=8)
(14 rows)

I thought Greg Stark had looked at some performance tweaks wrt
constraint partitioning, but I can't recall the details. Was that
related to this at all?

Michael Glaesemann
michael(dot)glaesemann(at)myyearbook(dot)com

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2008-10-14 19:16:30 Re: Updates of SE-PostgreSQL 8.4devel patches
Previous Message Simon Riggs 2008-10-14 17:50:35 Deriving Recovery Snapshots