Re: Increase performance of a UNION query that thakes 655.07 msec to be runned ?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Bruno BAGUETTE" <pgsql-ml(at)baguette(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Increase performance of a UNION query that thakes 655.07 msec to be runned ?
Date: 2004-02-06 15:28:24
Message-ID: 29924.1076081304@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

"Bruno BAGUETTE" <pgsql-ml(at)baguette(dot)net> writes:
> Do you see a way to get better performances with this query which takes
> currently 655.07 msec to be done.

> levure=> explain analyze SELECT distinct lower(substr(l_name, 1, 1)) AS
> initiale FROM people
> levure-> UNION
> levure-> SELECT distinct lower(substr(org_name, 1, 1)) AS initiale FROM
> organizations
> levure-> ORDER BY initiale;

This is inherently a bit inefficient since the UNION implies a DISTINCT
step, thus partially repeating the DISTINCT work done inside each SELECT.
It would likely be a tad faster to drop the DISTINCTs from the
subselects and rely on UNION to do the filtering. However, you're still
gonna have a big SORT/UNIQUE step.

As of PG 7.4 you could probably get a performance win by converting the
thing to use GROUP BY instead of DISTINCT or UNION:

select initiale from (
select lower(substr(l_name,1,1)) as initiale from people
union all
select lower(substr(org_name,1,1)) as initiale from organizations
) ss
group by initiale order by initiale;

This should use a HashAggregate to do the unique-ification. I think
that will be faster than Sort/Unique.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Stephan Szabo 2004-02-06 15:52:24 Re: Increase performance of a UNION query that thakes
Previous Message Tom Lane 2004-02-06 14:55:28 Re: Seq scan on zero-parameters function