Skip site navigation (1) Skip section navigation (2)

Re: Speed difference between select ... union select ... and select from partitioned_table

From: Pablo Alcaraz <pabloa(at)laotraesquina(dot)com(dot)ar>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Speed difference between select ... union select ... and select from partitioned_table
Date: 2007-10-27 05:14:21
Message-ID: 4722C92D.8050501@laotraesquina.com.ar (view raw or flat)
Thread:
Lists: pgsql-performance
These are the EXPLAIN ANALIZE:

I ran both queries on a CLUSTER and ANALYZEd tables:

UNION QUERY
explain analyze
select e, p,  sum( c) as c
from (
        select e, p, count( *) as c
        from tt_00003
        group by e, p
        union
        select e, p, count( *) as c
        from tt_00006
        group by e, p
        union
        select e, p, count( *) as c
        from tt_00009
        group by e, p
        union
        select e, p, count( *) as c
        from tt_00012
        group by e, p
        union
        select e, p, count( *) as c
        from tt_00015
        group by e, p
) as t
group by e, p
order by e, p desc;

"Sort  (cost=2549202.87..2549203.37 rows=200 width=16) (actual 
time=263593.182..263593.429 rows=207 loops=1)"
"  Sort Key: t.e, t.p"
"  ->  HashAggregate  (cost=2549192.73..2549195.23 rows=200 width=16) 
(actual time=263592.469..263592.763 rows=207 loops=1)"
"        ->  Unique  (cost=2549172.54..2549179.88 rows=734 width=8) 
(actual time=263590.481..263591.764 rows=356 loops=1)"
"              ->  Sort  (cost=2549172.54..2549174.38 rows=734 width=8) 
(actual time=263590.479..263590.891 rows=356 loops=1)"
"                    Sort Key: e, p, c"
"                    ->  Append  (cost=1307131.88..2549137.60 rows=734 
width=8) (actual time=132862.176..263589.774 rows=356 loops=1)"
"                          ->  HashAggregate  
(cost=1307131.88..1307133.03 rows=92 width=8) (actual 
time=132862.173..132862.483 rows=200 loops=1)"
"                                ->  Seq Scan on tt_00003  
(cost=0.00..1081550.36 rows=30077536 width=8) (actual 
time=10.135..83957.424 rows=30000000 loops=1)"
"                          ->  HashAggregate  
(cost=1241915.64..1241916.16 rows=42 width=8) (actual 
time=130726.219..130726.457 rows=156 loops=1)"
"                                ->  Seq Scan on tt_00006  
(cost=0.00..1028793.22 rows=28416322 width=8) (actual 
time=11.389..87338.730 rows=28351293 loops=1)"
"                          ->  HashAggregate  (cost=24.53..27.03 
rows=200 width=8) (actual time=0.005..0.005 rows=0 loops=1)"
"                                ->  Seq Scan on tt_00009  
(cost=0.00..18.30 rows=830 width=8) (actual time=0.002..0.002 rows=0 
loops=1)"
"                          ->  HashAggregate  (cost=24.53..27.03 
rows=200 width=8) (actual time=0.004..0.004 rows=0 loops=1)"
"                                ->  Seq Scan on tt_00012  
(cost=0.00..18.30 rows=830 width=8) (actual time=0.002..0.002 rows=0 
loops=1)"
"                          ->  HashAggregate  (cost=24.53..27.03 
rows=200 width=8) (actual time=0.005..0.005 rows=0 loops=1)"
"                                ->  Seq Scan on tt_00015  
(cost=0.00..18.30 rows=830 width=8) (actual time=0.001..0.001 rows=0 
loops=1)"
"Total runtime: 263594.381 ms"


PARTITIONED QUERY

explain analyze
select e, p,  count( *) as c
from tt
group by e, p
order by e, p desc;

"GroupAggregate  (cost=13256958.67..13842471.95 rows=40000 width=8) 
(actual time=899391.384..1065585.531 rows=207 loops=1)"
"  ->  Sort  (cost=13256958.67..13403211.99 rows=58501328 width=8) 
(actual time=899391.364..989749.914 rows=58351293 loops=1)"
"        Sort Key: public.tt.e, public.tt.p"
"        ->  Append  (cost=0.00..2110508.28 rows=58501328 width=8) 
(actual time=14.031..485211.466 rows=58351293 loops=1)"
"              ->  Seq Scan on tt  (cost=0.00..18.30 rows=830 width=8) 
(actual time=0.002..0.002 rows=0 loops=1)"
"              ->  Seq Scan on tt_00003 tt  (cost=0.00..1081550.36 
rows=30077536 width=8) (actual time=14.024..178657.738 rows=30000000 
loops=1)"
"              ->  Seq Scan on tt_00006 tt  (cost=0.00..1028793.22 
rows=28416322 width=8) (actual time=39.852..168307.030 rows=28351293 
loops=1)"
"              ->  Seq Scan on tt_00009 tt  (cost=0.00..18.30 rows=830 
width=8) (actual time=0.001..0.001 rows=0 loops=1)"
"              ->  Seq Scan on tt_00012 tt  (cost=0.00..18.30 rows=830 
width=8) (actual time=0.001..0.001 rows=0 loops=1)"
"              ->  Seq Scan on tt_00015 tt  (cost=0.00..18.30 rows=830 
width=8) (actual time=0.001..0.001 rows=0 loops=1)"
"              ->  Seq Scan on tt_00018 tt  (cost=0.00..18.30 rows=830 
width=8) (actual time=0.001..0.001 rows=0 loops=1)"
"              ->  Seq Scan on tt_00021 tt  (cost=0.00..18.30 rows=830 
width=8) (actual time=0.002..0.002 rows=0 loops=1)"
"              ->  Seq Scan on tt_00024 tt  (cost=0.00..18.30 rows=830 
width=8) (actual time=0.002..0.002 rows=0 loops=1)"
"              ->  Seq Scan on tt_00027 tt  (cost=0.00..18.30 rows=830 
width=8) (actual time=0.002..0.002 rows=0 loops=1)"
"              ->  Seq Scan on tt_00030 tt  (cost=0.00..18.30 rows=830 
width=8) (actual time=0.002..0.002 rows=0 loops=1)"
"Total runtime: 1066301.084 ms"

Any idea?

Regards

Pablo


Jeff Davis wrote:
> On Fri, 2007-10-26 at 16:37 -0400, Pablo Alcaraz wrote:
>   
>> Hi List!
>>
>> I executed 2 equivalents queries. The first one uses a union structure. 
>> The second uses a partitioned table. The tables are the same with 30 
>> millions of rows each one and the returned rows are the same.
>>
>> But the union query perform faster than the partitioned query.
>>
>>     
>
> I think you mean to use UNION ALL here. UNION forces a DISTINCT, which
> results in a sort operation. What surprises me is that the UNION is
> actually faster than the partitioning using inheritance.
>
> I suspect it has something to do with the GROUP BYs, but we won't know
> until you post EXPLAIN ANALYZE results.
>
>     Regards,
> 	Jeff Davis
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>
>   

In response to

Responses

pgsql-performance by date

Next:From: Tom LaneDate: 2007-10-27 05:37:29
Subject: Re: partitioned table and ORDER BY indexed_field DESC LIMIT 1
Previous:From: AntonDate: 2007-10-27 04:26:21
Subject: Re: partitioned table and ORDER BY indexed_field DESC LIMIT 1

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group