Skip site navigation (1) Skip section navigation (2)

Re: Partitioning Option?

From: Gaetano Mendola <mendola(at)bigfoot(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Partitioning Option?
Date: 2005-02-25 00:44:38
Message-ID: 421E74F6.5000603@bigfoot.com (view raw or flat)
Thread:
Lists: pgsql-admin
Tom Lane wrote:
> "Tomeh, Husam" <htomeh(at)firstam(dot)com> writes:
> 
>>(I was referring to object partitioning. For instance, if I have a huge
>>table with US counties as my partition key, I could create partitions
>>within the same table based on the partition key (a US county for
>>example). When querying, the engine will access the partition instead of
>>the whole table to get the result set. This is provided in Oracle DB EE.
>>So, I was wondering whether I can do similar thing in PostgreSQL since
>>we're exploring PostgreSQL)
> 
> 
> You can build it out of spare parts: either a view over a UNION ALL of
> component tables, or a parent table with a bunch of inheritance
> children, either way with rules to redirect insertions into the
> right subtable.  (With the inheritance way you could instead use
> a trigger for that, which'd likely be more flexible.)

Tom, I did a post on performance about my attempt to do an horizontal partition,
in a 7.4.x engine, but it seems the planner refuse to optimize it,
look at this for example:


CREATE TABLE user_logs_2003_h () inherits (user_logs);
CREATE TABLE user_logs_2002_h () inherits (user_logs);

I defined on these tables the index already defined on user_logs.

And this is the result:

empdb=# explain analyze select * from user_logs where id_user = sp_id_user('kalman');
                                                                               QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.00..426.33 rows=335 width=67) (actual time=20.891..129.218 rows=98 loops=1)
   ->  Append  (cost=0.00..426.33 rows=335 width=67) (actual time=20.871..128.643 rows=98 loops=1)
         ->  Index Scan using idx_user_user_logs on user_logs  (cost=0.00..133.11 rows=66 width=67) (actual time=20.864..44.594 rows=3 loops=1)
               Index Cond: (id_user = 4185)
         ->  Index Scan using idx_user_user_logs_2003_h on user_logs_2003_h user_logs  (cost=0.00..204.39 rows=189 width=67) (actual time=1.507..83.662 rows=95 loops=1)
               Index Cond: (id_user = 4185)
         ->  Index Scan using idx_user_user_logs_2002_h on user_logs_2002_h user_logs  (cost=0.00..88.83 rows=80 width=67) (actual time=0.206..0.206 rows=0 loops=1)
               Index Cond: (id_user = 4185)
 Total runtime: 129.500 ms
(9 rows)

that is good, but now look what happen in a view like this one ( where I join the view above ):


create view to_delete AS
SELECT v.login,
       u.*
from  user_login v,
      user_logs u
where v.id_user = u.id_user;



empdb=# explain analyze select * from to_delete where login = 'kalman';
                                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=4.01..65421.05 rows=143 width=79) (actual time=1479.738..37121.511 rows=98 loops=1)
   Hash Cond: ("outer".id_user = "inner".id_user)
   ->  Append  (cost=0.00..50793.17 rows=2924633 width=67) (actual time=21.391..33987.363 rows=2927428 loops=1)
         ->  Seq Scan on user_logs u  (cost=0.00..7195.22 rows=411244 width=67) (actual time=21.385..5641.307 rows=414039 loops=1)
         ->  Seq Scan on user_logs_2003_h u  (cost=0.00..34833.95 rows=2008190 width=67) (actual time=0.024..18031.218 rows=2008190 loops=1)
         ->  Seq Scan on user_logs_2002_h u  (cost=0.00..8764.00 rows=505199 width=67) (actual time=0.005..5733.554 rows=505199 loops=1)
   ->  Hash  (cost=4.00..4.00 rows=2 width=16) (actual time=0.195..0.195 rows=0 loops=1)
         ->  Index Scan using user_login_login_key on user_login v  (cost=0.00..4.00 rows=2 width=16) (actual time=0.155..0.161 rows=1 loops=1)
               Index Cond: ((login)::text = 'kalman'::text)
 Total runtime: 37122.069 ms
(10 rows)



I did a similar attempt with UNION ALL but the result is the same.



Regards
Gaetano Mendola









In response to

pgsql-admin by date

Next:From: Bruce MomjianDate: 2005-02-25 01:15:41
Subject: Re: Statistics Collector & Query Optimizer, any relation?
Previous:From: Gaetano MendolaDate: 2005-02-25 00:40:08
Subject: Re: Partitioning Option?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group