Re: Optimising Union Query.

From: Rob Kirkbride <rob(dot)kirkbride(at)thales-is(dot)com>
To: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>, pgsql-general(at)postgresql(dot)org
Subject: Re: Optimising Union Query.
Date: 2005-04-25 13:05:53
Message-ID: 426CEB31.8070806@thales-is.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Jim C. Nasby wrote on 25/04/2005 01:28:

>On Sat, Apr 23, 2005 at 10:39:14PM +0000, Patrick TJ McPhee wrote:
>
>
>>In article <4268F322(dot)1040106(at)thales-is(dot)com>,
>>Rob Kirkbride <rob(dot)kirkbride(at)thales-is(dot)com> wrote:
>>
>>% I've done a explain analyze and as I expected the database has to check
>>% every row in each of the three tables below but I'm wondering if I can
>>
>>This is because you're returning a row for every row in the three
>>tables.
>>
>>% select l.name,l.id from pa i,locations l where i.location=l.id union
>>% select l.name,l.id from andu i,locations l where i.location=l.id union
>>% select l.name,l.id from idu i,locations l where i.location=l.id;
>>
>>You might get some improvement from
>>
>> select name,id from locations
>> where id in (select distinct location from pa union
>> select distinct location from andu union
>> select distinct location from idu);
>>
>>
>
>Note that SELECT DISTINCT is redundant with a plain UNION. By
>definition, UNION does a DISTINCT. In fact, this is going to hurt you;
>you'll end up doing 4 distinct operations (one for each SELECT DISTINCT
>and one for the overall UNION). Unless some of those tables have a lot
>of duplicated location values, you should either use UNION ALLs or drop
>the DISTINCTs. Note that going with DISTINCTs is different than what
>your original query does.
>
>You should also consider this:
>
>SELECT name, id FROM locations l
> WHERE EXISTS (SELECT * FROM pa p WHERE p.location=l.id)
> OR EXISTS (SELECT * FROM andu a WHERE a.location=l.id)
> OR EXISTS (SELECT * FROM idu i WHERE i.location=l.id)
>
>This query would definately be helped by having indexes on
>(pa|andu|idu).location.
>
>

Thanks for that. I tried a few things, including using DISTINCTS and
UNION ALLs but none made a big difference.
However your query above sped things up by a factor of more than 2.

Thanks very much!

Rob

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Chris Kratz 2005-04-25 13:50:29 Hosting Service Recommendations
Previous Message K.RajaSekar 2005-04-25 10:07:53 Installation problem with the version 8.0.2