Re: slow UNIONing

From: Barry Lind <barry(at)xythos(dot)com>
To: Kovacs Zoltan <kovacsz(at)pc10(dot)radnoti-szeged(dot)sulinet(dot)hu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: slow UNIONing
Date: 2001-09-19 02:30:26
Message-ID: 3BA80342.4000305@xythos.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Kovacs,

A 'union all' will be much faster than 'union'. 'union all' returns all
results from both queries, whereas 'union' will return all distinct
records. The 'union' requires a sort and a merge to remove the
duplicate values. Below are explain output for a union query and a
union all query.

files=# explain
files-# select dummy from test
files-# union all
files-# select dummy from test;
NOTICE: QUERY PLAN:

Append (cost=0.00..40.00 rows=2000 width=12)
-> Subquery Scan *SELECT* 1 (cost=0.00..20.00 rows=1000 width=12)
-> Seq Scan on test (cost=0.00..20.00 rows=1000 width=12)
-> Subquery Scan *SELECT* 2 (cost=0.00..20.00 rows=1000 width=12)
-> Seq Scan on test (cost=0.00..20.00 rows=1000 width=12)

EXPLAIN
files=# explain
files-# select dummy from test
files-# union
files-# select dummy from test;
NOTICE: QUERY PLAN:

Unique (cost=149.66..154.66 rows=200 width=12)
-> Sort (cost=149.66..149.66 rows=2000 width=12)
-> Append (cost=0.00..40.00 rows=2000 width=12)
-> Subquery Scan *SELECT* 1 (cost=0.00..20.00 rows=1000
width=12)
-> Seq Scan on test (cost=0.00..20.00 rows=1000
width=12)
-> Subquery Scan *SELECT* 2 (cost=0.00..20.00 rows=1000
width=12)
-> Seq Scan on test (cost=0.00..20.00 rows=1000
width=12)

EXPLAIN
files=#

thanks,
--Barry

Kovacs Zoltan wrote:

> I experienced that UNIONs in 7.1.1 are rather slow:
>
> tir=# explain (select nev from cikk) union (select tevekenyseg from log);
> NOTICE: QUERY PLAN:
>
> Unique (cost=667.63..687.18 rows=782 width=12)
> -> Sort (cost=667.63..667.63 rows=7817 width=12)
> -> Append (cost=0.00..162.17 rows=7817 width=12)
> -> Subquery Scan *SELECT* 1 (cost=0.00..28.16 rows=1316 width=12)
> -> Seq Scan on cikk (cost=0.00..28.16 rows=1316 width=12)
> -> Subquery Scan *SELECT* 2 (cost=0.00..134.01 rows=6501 width=12)
> -> Seq Scan on log (cost=0.00..134.01 rows=6501 width=12)
>
> Of course a simple SELECT is fast:
>
> tir=# explain select nev from cikk;
> NOTICE: QUERY PLAN:
>
> Seq Scan on cikk (cost=0.00..28.16 rows=1316 width=12)
>
>
> For me it seems to be slow due to the sorting. Is this right?
> Is this normal at all? Is it possible to make it faster?
>
> TIA, Zoltan
>
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Christopher Kings-Lynne 2001-09-19 03:07:32 Re: Beta time
Previous Message Martín Marqués 2001-09-18 21:22:15 Re: Putting timestamps in PostgreSQL log