From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Josh Berkus <josh(at)agliodbs(dot)com> |
Cc: | Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>, Nicolas Fertig <nfertig(at)swissnow(dot)ch>, pgsql-sql(at)postgresql(dot)org |
Subject: | Re: OUTER JOIN with filter |
Date: | 2003-03-03 19:17:19 |
Message-ID: | 15484.1046719039@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
Josh Berkus <josh(at)agliodbs(dot)com> writes:
> For that matter, in a few queries I've found that it pays to force the
> planner's hand by repeating some of the clauses in the WHERE clause in the
> JOIN as well, as:
> SELECT a.x, b.y, c.z
> FROM a JOIN b ON (a.id = b.a_id AND b.type = 'm')
> LEFT OUTER JOIN c ON (b.id = c.b_id)
> WHERE b.type = 'm'
> The logic is fine. The issue comes where the outer joined table is several
> times larger than the main queried table. The planner should recognize the
> possibility of filtering the records in the joined table before joining in
> order to minimize the join operation.
AFAIR, it's always done that; certainly the above query looks quite
redundant to me. Note though that pushing down the WHERE clause is only
legal because b is on the outer side of the join.
A comparable query in the 7.3 regression database is
regression=# explain select * from
regression-# tenk1 a join tenk1 b on (a.unique1 = b.unique1)
regression-# left join tenk1 c on (b.unique2 = c.unique2)
regression-# where b.thousand = 999;
QUERY PLAN
-------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..572.49 rows=10 width=732)
-> Nested Loop (cost=0.00..542.55 rows=10 width=488)
-> Seq Scan on tenk1 b (cost=0.00..483.00 rows=10 width=244)
Filter: (thousand = 999)
-> Index Scan using tenk1_unique1 on tenk1 a (cost=0.00..6.01 rows=1 width=244)
Index Cond: (a.unique1 = "outer".unique1)
-> Index Scan using tenk1_unique2 on tenk1 c (cost=0.00..3.01 rows=1 width=244)
Index Cond: ("outer".unique2 = c.unique2)
(8 rows)
You can see that the condition on b.thousand does get pushed down to the
bottom scan. (There's no index on b.thousand, so we don't get an
indexscan --- but we would if there were an index...)
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Stark | 2003-03-03 20:46:17 | Re: HardCORE QUERY HELP!!! |
Previous Message | Chris | 2003-03-03 18:54:34 | LIMIT and SUBQUERIES |