Re: a JOIN on same table, but 'slided over'

From: Rafal Pietrak <rafal(at)zorro(dot)isa-geek(dot)com>
To: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
Cc: hubert depesz lubaczewski <depesz(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: a JOIN on same table, but 'slided over'
Date: 2007-06-28 08:44:46
Message-ID: 1183020286.28091.95.camel@zorro.isa-geek.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Gurjeet,

Focusing on the standars solution, I did some 'exercises' - works fine,
just learning.

But the ambarasing thing is, that I looks like I really don't get it,
meaning - what exactly the internal query does. I've never ever seen or
used a subquery with data/params from 'upper level' query used within a
subquery - any time I've written a hierarchical query (e.g. with
subqueries), the relations were always hierarchical. In other words, I
was always able to run an internal subquery outside of the compound
query and get consistant results. With this one I cannot do that due to
the 'entanglement' of t3 and t1.

Postgress query plan from EXPLAIN doesn't help me here - probably I'm
unable to interpret it correctly without 'a paradigm mind shift'.

So, would you mind commenting a little on how exactly the t1.id
influences subquery (with t3), and the result influences back the
selection of t1 set?

Will greatly apreciate that.

-R

On Tue, 2007-06-26 at 19:14 +0530, Gurjeet Singh wrote:
> I missed the ORDER BY clause... Here it goes:
>
> select t1.id as id, t2.id as "id+1",
> t1.thread as thread, t2.thread as "thread+1",
> t1.info as info, t2.info as "info+1"
> from test as t1, test as t2
> where t2.id = ( select min(id) from test as t3 where t3.id > t1.id )
> order by t1.id asc;
>
> Also note that this query is much cheaper that the 'distinct on' query
> by more than two orders on magnitude ( 217.86 vs. 98040.67):
>
> postgres=# explain
> postgres-# select
> postgres-# distinct on (t1.id)
> postgres-# t1.*, t2.*
> postgres-# from
> postgres-# test t1
> postgres-# join test t2 on t2.id > t1.id
> postgres-# order by t1.id asc, t2.id asc;
> QUERY PLAN
> ------------------------------------------------------------------------------------------------
> Unique (cost=95798.00..98040.67 rows=1160 width=80)
> -> Sort (cost=95798.00..96919.33 rows=448533 width=80)
> Sort Key: t1.id, t2.id
> -> Nested Loop (cost=0.00..13827.29 rows=448533 width=80)
> -> Seq Scan on test t1 (cost=0.00..21.60 rows=1160
> width=40)
> -> Index Scan using test_id_key on test t2
> (cost=0.00..7.06 rows=387 width=40)
> Index Cond: (t2.id > t1.id)
> (7 rows)
> Time: 5.003 ms
> postgres=# explain
> postgres-# select t1.id as id, t2.id as "id+1",
> postgres-# t1.thread as thread, t2.thread as "thread+1",
> postgres-# t1.info as info, t2.info as "info+1"
> postgres-# from test as t1, test as t2
> postgres-# where t2.id = ( select min(id) from test as t3 where t3.id
> > t1.id )
> postgres-# order by t1.id asc;
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------
> Sort (cost=214.96..217.86 rows=1160 width=80)
> Sort Key: t1.id
> -> Hash Join (cost= 36.10..155.92 rows=1160 width=80)
> Hash Cond: ((subplan) = t2.id)
> -> Seq Scan on test t1 (cost=0.00..21.60 rows=1160
> width=40)
> -> Hash (cost=21.60..21.60 rows=1160 width=40)
> -> Seq Scan on test t2 (cost=0.00..21.60 rows=1160
> width=40)
> SubPlan
> -> Result (cost=0.13..0.14 rows=1 width=0)
> InitPlan
> -> Limit (cost= 0.00..0.13 rows=1 width=4)
> -> Index Scan using test_id_key on test t3
> (cost=0.00..51.02 rows=387 width=4)
> Index Cond: (id > $0)
> Filter: (id IS NOT NULL)
> (14 rows)
> Time: 4.125 ms
>
>
> Best regards,
> --
> gurjeet[(dot)singh](at)EnterpriseDB(dot)com
> singh(dot)gurjeet(at){ gmail | hotmail | indiatimes | yahoo }.com
>
> 17°29'34.37"N 78°30'59.76"E - Hyderabad *
> 18°32'57.25"N 73°56'25.42"E - Pune
>
> Sent from my BlackLaptop device
>
> On 6/26/07, Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com > wrote:
> Hi Rafal,
>
> Just a note that this is not standard SQL... 'distinct on'
> is an extension to SQL provided by postgres.
>
> Following query utilizes the standard SQL to get the same
> results:
>
> select t1.id as id, t2.id as "id+1",
> t1.thread as thread, t2.thread as "thread+1",
> t1.info as info, t2.info as "info+1"
> from test as t1, test as t2
> where t2.id = ( select min(id) from test as t3 where t3.id >
> t1.id);
>
> HTH
> --
> gurjeet[(dot)singh](at)EnterpriseDB(dot)com
> singh(dot)gurjeet(at){ gmail | hotmail | indiatimes | yahoo }.com
>
> 17°29'34.37"N 78°30'59.76"E - Hyderabad *
> 18°32'57.25"N 73°56' 25.42 "E - Pune
>
> Sent from my BlackLaptop device
>
>
> On 6/26/07, Rafal Pietrak <rafal(at)zorro(dot)isa-geek(dot)com> wrote:
> Marvelous! Thenx!
>
> -R
>
> On Tue, 2007-06-26 at 10:06 +0200, hubert depesz
> lubaczewski wrote:
> > On 6/26/07, Rafal Pietrak <rafal(at)zorro(dot)isa-geek(dot)com>
> wrote:
> > Is there an SQL construct to get it?
> >
> > select
> > distinct on (t1.id)
> > t1.*, t2.*
> > from
> > test t1
> > join test t2 on t2.id > t1.id
> > order by t1.id asc, t2.id asc
> >
> > should do the trick.
> >
> > depesz
> >
> > --
> > http://www.depesz.com/ - nowy, lepszy depesz
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org/
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Bruce McAlister 2007-06-28 09:33:58 Re: AutoVacuum Behaviour Question
Previous Message Björn Lundin 2007-06-28 08:20:12 using PREPAREd statements in CURSOR