Skip site navigation (1) Skip section navigation (2)

Re: Issue with partition elimination

From: Paul Pierce <paulpierce70(at)yahoo(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Issue with partition elimination
Date: 2011-04-22 01:26:18
Message-ID: 505432.93484.qm@web120209.mail.ne1.yahoo.com (view raw)
      
This is a good one :)

Here is a "brief" description of our issue(Postgres 9.0): 

Tables:
main fact table:
Table "public.parent_fact"
        Column        |            Type             |  
----------------------+-----------------------------+-----------
 etime                | date | not null
 pcamp_id             | integer                     | 
 location_id          | integer                     | 
 impressions          | bigint                      | 
 clicks              | int

this table partitioned by etime.

We are trying to build a report, which has last week numbers alongside with this 
week numbers. For example: if today is Wednesday, I want to compare daily 
numbers from last week 3 days (mon through wed) with this week 3 days(mon 
through wed).

To accomplish that, we've decided to build a transformation table, which has two 
columns:

 Table "public.trans_last_week"
  Column  |            Type             | Modifiers 
----------+-----------------------------+-----------
 etime    | date | 
 lw_etime | date |

So for each date(etime), we have lw_etime, which is essentially etime-7 days.

Here is the first query, which performs fine:

select    a11.location_id AS location_id,
    a11.pcamp_id AS  pcamp_id,
    sum(a11.clicks)
from    parent_fact    a11
where    a11.etime between '2011-14-18' and '2011-04-20'
group  by    a11.location_id,
    a11.pcamp_id

everything is good there -  it calculates numbers from the current week and goes 
to only 3 partitions to aggregate numbers. 

Here is the second query:

select    a11.location_id AS location_id,
    a11.pcamp_id AS  pcamp_id,
    sum(a11.clicks)
from    parent_fact    a11
    join    trans_last_week    a12
      on     (a11.etime = a12.lw_etime)
where    a12.etime between '2011-14-18' and '2011-04-20'
group  by    a11.location_id,
    a11.pcamp_id


Here it scans through all partitions in the parent_fact table and runs 3-4 times 
slower.

What was noticed, that the only case when Postgres is actually going to execute 
the query against the right partitions is query #1. 

Is that by design? Second query join, will also result in 3 days(3 partitions) 

This query (#3) also scans all partitions:

select    a11.location_id AS location_id,
    a11.pcamp_id AS  pcamp_id,
    sum(a11.clicks)
from    parent_fact    a11
where    a11.etime in (select a12.etime from trans_last_week    a12 
where a11.etime = a12.lw_etime)
group  by    a11.location_id,
    a11.pcamp_id


Thank you!
From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Issue with partition elimination
Date: 2011-04-25 18:06:47
Message-ID: 4DB5B837.2010602@agliodbs.com (view raw)
On 4/21/11 6:26 PM, Paul Pierce wrote:
> What was noticed, that the only case when Postgres is actually going to execute 
> the query against the right partitions is query #1. 
> 
> Is that by design? Second query join, will also result in 3 days(3 partitions) 

Partition elimination currently can only handle constants and
expressions which are equivalent to constants.  It will not filter on
Joins successfully.

This will improve somewhat in 9.1, possibly enough to fix your case.
Please test this on 9.1a5 and see how well it works, and give us feedback.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Privacy Policy | About PostgreSQL
Copyright © 1996-2013 The PostgreSQL Global Development Group