Skip site navigation (1) Skip section navigation (2)

Issue with partition elimination

From: Paul Pierce <paulpierce70(at)yahoo(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Issue with partition elimination
Date: 2011-04-22 01:26:18
Message-ID: 505432.93484.qm@web120209.mail.ne1.yahoo.com (view raw or flat)
Thread:
Lists: pgsql-performance
      
This is a good one :)

Here is a "brief" description of our issue(Postgres 9.0): 

Tables:
main fact table:
Table "public.parent_fact"
        Column        |            Type             |  
----------------------+-----------------------------+-----------
 etime                | date | not null
 pcamp_id             | integer                     | 
 location_id          | integer                     | 
 impressions          | bigint                      | 
 clicks              | int

this table partitioned by etime.

We are trying to build a report, which has last week numbers alongside with this 
week numbers. For example: if today is Wednesday, I want to compare daily 
numbers from last week 3 days (mon through wed) with this week 3 days(mon 
through wed).

To accomplish that, we've decided to build a transformation table, which has two 
columns:

 Table "public.trans_last_week"
  Column  |            Type             | Modifiers 
----------+-----------------------------+-----------
 etime    | date | 
 lw_etime | date |

So for each date(etime), we have lw_etime, which is essentially etime-7 days.

Here is the first query, which performs fine:

select    a11.location_id AS location_id,
    a11.pcamp_id AS  pcamp_id,
    sum(a11.clicks)
from    parent_fact    a11
where    a11.etime between '2011-14-18' and '2011-04-20'
group  by    a11.location_id,
    a11.pcamp_id

everything is good there -  it calculates numbers from the current week and goes 
to only 3 partitions to aggregate numbers. 

Here is the second query:

select    a11.location_id AS location_id,
    a11.pcamp_id AS  pcamp_id,
    sum(a11.clicks)
from    parent_fact    a11
    join    trans_last_week    a12
      on     (a11.etime = a12.lw_etime)
where    a12.etime between '2011-14-18' and '2011-04-20'
group  by    a11.location_id,
    a11.pcamp_id


Here it scans through all partitions in the parent_fact table and runs 3-4 times 
slower.

What was noticed, that the only case when Postgres is actually going to execute 
the query against the right partitions is query #1. 

Is that by design? Second query join, will also result in 3 days(3 partitions) 

This query (#3) also scans all partitions:

select    a11.location_id AS location_id,
    a11.pcamp_id AS  pcamp_id,
    sum(a11.clicks)
from    parent_fact    a11
where    a11.etime in (select a12.etime from trans_last_week    a12 
where a11.etime = a12.lw_etime)
group  by    a11.location_id,
    a11.pcamp_id


Thank you!

Responses

pgsql-performance by date

Next:From: drvilloDate: 2011-04-22 09:21:35
Subject: Checkpoint execution overrun impact?
Previous:From: Merlin MoncureDate: 2011-04-21 20:12:37
Subject: Re: oom_killer

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group