Issue with partition elimination

From: Paul Pierce <paulpierce70(at)yahoo(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Issue with partition elimination
Date: 2011-04-22 01:26:18
Message-ID: 505432.93484.qm@web120209.mail.ne1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


This is a good one :)

Here is a "brief" description of our issue(Postgres 9.0):

Tables:
main fact table:
Table "public.parent_fact"
Column | Type |
----------------------+-----------------------------+-----------
etime | date | not null
pcamp_id | integer |
location_id | integer |
impressions | bigint |
clicks | int

this table partitioned by etime.

We are trying to build a report, which has last week numbers alongside with this
week numbers. For example: if today is Wednesday, I want to compare daily
numbers from last week 3 days (mon through wed) with this week 3 days(mon
through wed).

To accomplish that, we've decided to build a transformation table, which has two
columns:

Table "public.trans_last_week"
Column | Type | Modifiers
----------+-----------------------------+-----------
etime | date |
lw_etime | date |

So for each date(etime), we have lw_etime, which is essentially etime-7 days.

Here is the first query, which performs fine:

select a11.location_id AS location_id,
a11.pcamp_id AS pcamp_id,
sum(a11.clicks)
from parent_fact a11
where a11.etime between '2011-14-18' and '2011-04-20'
group by a11.location_id,
a11.pcamp_id

everything is good there - it calculates numbers from the current week and goes
to only 3 partitions to aggregate numbers.

Here is the second query:

select a11.location_id AS location_id,
a11.pcamp_id AS pcamp_id,
sum(a11.clicks)
from parent_fact a11
join trans_last_week a12
on (a11.etime = a12.lw_etime)
where a12.etime between '2011-14-18' and '2011-04-20'
group by a11.location_id,
a11.pcamp_id

Here it scans through all partitions in the parent_fact table and runs 3-4 times
slower.

What was noticed, that the only case when Postgres is actually going to execute
the query against the right partitions is query #1.

Is that by design? Second query join, will also result in 3 days(3 partitions)

This query (#3) also scans all partitions:

select a11.location_id AS location_id,
a11.pcamp_id AS pcamp_id,
sum(a11.clicks)
from parent_fact a11
where a11.etime in (select a12.etime from trans_last_week a12
where a11.etime = a12.lw_etime)
group by a11.location_id,
a11.pcamp_id

Thank you!

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message drvillo 2011-04-22 09:21:35 Checkpoint execution overrun impact?
Previous Message Merlin Moncure 2011-04-21 20:12:37 Re: oom_killer