Quick Links

Re: Query much faster with enable_seqscan=0

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Samuel Gendler <sgendler(at)ideasculptor(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ogden <lists(at)darkstatic(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: Query much faster with enable_seqscan=0
Date:	2010-10-28 16:21:01
Message-ID:	AANLkTi=cvPjxDLJHpJ82F=zXtGqXRjb7mtvUod-vx6ZC@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On Tue, Oct 12, 2010 at 10:28 PM, Samuel Gendler
<sgendler(at)ideasculptor(dot)com> wrote:
>
>
> On Tue, Sep 21, 2010 at 4:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>> Ogden <lists(at)darkstatic(dot)com> writes:
>> > SELECT tr.id, tr.sid
>> > FROM
>> > test_registration tr,
>> > INNER JOIN test_registration_result r on (tr.id =
>> > r.test_registration_id)
>> > WHERE.
>> >
>> > tr.test_administration_id='32a22b12-aa21-11df-a606-96551e8f4e4c'::uuid
>> > GROUP BY tr.id, tr.sid
>>
>> Seeing that tr.id is a primary key, I think you might be a lot better
>> off if you avoided the inner join and group by. I think what you really
>> want here is something like
>>
>> SELECT tr.id, tr.sid
>> FROM
>> test_registration tr
>> WHERE
>>
>> tr.test_administration_id='32a22b12-aa21-11df-a606-96551e8f4e4c'::uuid
>> AND EXISTS(SELECT 1 FROM test_registration_result r
>> WHERE tr.id = r.test_registration_id)
>>
>> regards, tom lane
>>
>
> Could you explain the logic behind why this structure is better than the
> other? Is it always the case that one should just always use the
> 'exists(select 1 from x...)' structure when trying to strip rows that don't
> join or is it just the case when you know that the rows which do join are a
> fairly limited subset? Does the same advantage exist if filtering rows in
> the joined table on some criteria, or is it better at that point to use an
> inner join and add a where clause to filter the joined rows.
> select table1.columns
> from table1, table2
> where table1.column = 'some_value'
>    and table1.fk = table2.pk
>   AND table2.column = 'some_other_value'
> versus
> select table1.columns
>   from table1
> where table1.column = 'some_value'
>    and exists(select 1 from table2 where table1.fk = table2.pk
>    and table2.column ='some_other_value')

I don't think there's much difference between those two cases. I
think Tom's point was that GROUP BY can be expensive - which it
certainly can. It's absolutely necessary and unavoidable for certain
queries, of course, but don't include it unless you need it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: Query much faster with enable_seqscan=0 at 2010-10-13 02:28:55 from Samuel Gendler

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Ben	2010-10-28 16:36:54	partitioning question 1
Previous Message	Robert Haas	2010-10-28 16:18:31	Re: Slow Query- Simple taking