Unions and where optimisation

From: Boris Klug <boris(dot)klug(at)control(dot)de>
To: pgsql-performance(at)postgresql(dot)org
Subject: Unions and where optimisation
Date: 2003-01-08 13:25:48
Message-ID: 200301081425.48597.boris.klug@control.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello!

I am quite new in the PostgreSQL performance business, done a few years Oracle
stuff before. My ist question is the following:

We have three table, lets name them rk150, 151 and rk152. They all have a
timestamp and a order number in common but than different data after this.
Now I need the data from all tables in one view for a given order number, so
I created a view

create view orderevents as
select ts, aufnr from rk150
union
select ts, aufnr from rk151
union
select ts, aufnr from rk152;

When I does a "select * from orderevents where aufnr='1234'" it takes over 14
seconds!
The problem is now that PostgreSQL first does the union with all the three
tables and after this sorts out the right rows:

Subquery Scan a (cost=54699.06..56622.18 rows=38462 width=20)
-> Unique (cost=54699.06..56622.18 rows=38462 width=20)
-> Sort (cost=54699.06..54699.06 rows=384624 width=20)
-> Append (cost=0.00..10689.24 rows=384624 width=20)
-> Subquery Scan *SELECT* 1
(cost=0.00..8862.52 rows=314852 width=20)
-> Seq Scan on rk150
(cost=0.00..8862.52 rows=314852 width=20)
-> Subquery Scan *SELECT* 2
(cost=0.00..1208.58 rows=45858 width=20)
-> Seq Scan on rk151
(cost=0.00..1208.58 rows=45858 width=20)
-> Subquery Scan *SELECT* 3
(cost=0.00..618.14 rows=23914 width=20)
-> Seq Scan on rk152
(cost=0.00..618.14 rows=23914 width=20)

A better thing would it (Oracle does this and I think I have seen it on
PostgreSQL before), that the where-clause is moved inside every select so we
have something like this (written by hand):

select * from (
select zeit, aufnr from rk150 where aufnr='13153811'
union
select zeit, aufnr from rk151 where aufnr='13153811'
union
select zeit, aufnr from rk152 where aufnr='13153811')
as A;

This takes less than 1 second because the nr of rows that have to be joined
are only 45 (optimizer expects 4), not > 300.000:

Subquery Scan a (cost=45.97..46.19 rows=4 width=20)
-> Unique (cost=45.97..46.19 rows=4 width=20)
-> Sort (cost=45.97..45.97 rows=45 width=20)
-> Append (cost=0.00..44.74 rows=45 width=20)
-> Subquery Scan *SELECT* 1
(cost=0.00..32.22 rows=31 width=20)
-> Index Scan using rk150_uidx_aufnr on rk150
(cost=0.00..32.22 rows=31 width=20)
-> Subquery Scan *SELECT* 2
(cost=0.00..7.67 rows=9 width=20)
-> Index Scan using rk151_uidx_aufnr on rk151
(cost=0.00..7.67 rows=9 width=20)
-> Subquery Scan *SELECT* 3
(cost=0.00..4.85 rows=5 width=20)
-> Index Scan using rk152_uidx_aufnr on rk152
(cost=0.00..4.85 rows=5 width=20)

My question now: Is the optimizer able to move the where clause into unions?
If so, how I can get him to do it?

Thank you for the help in advance!

--
Dipl. Inform. Boris Klug, control IT GmbH, Germany

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tomasz Myrta 2003-01-08 14:32:11 Re: Unions and where optimisation
Previous Message scott.marlowe 2003-01-07 23:16:09 Re: [PERFORM] PostgreSQL and memory usage