|From:||Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>|
|To:||robertmhaas(at)gmail(dot)com (Robert Haas), Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Fetter <david(at)fetter(dot)org>, pgsql-hackers(at)postgresql(dot)org|
|Views:||Raw Message | Whole Thread | Download mbox|
>>>>> "Robert" == Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Just to pick up on some points from the discussion:
>> 1. LATERAL has to be explicit because it changes the scope of
>> references. For example, in:
>> ... (select ... FROM (select a AS b), (select b)) ...
>> the "b" in the second subselect could be an outer reference, but
>> it can't be a reference to the first subquery; whereas in:
>> ... (select ... FROM (select a AS b), LATERAL (select b)) ...
>> the "b" in the second subselect refers to the result of the first
Robert> Can you provide a more complete example? I'm unable to
Robert> construct a working example of this type. For example:
Robert> rhaas=# select (select 1 from (select a as b) x, (select b) y) from t1;
Robert> ERROR: subquery in FROM cannot refer to other relations of same query
Robert> level at character 50
That looks like a bug to me. The spec is explicit that the inner definition
of b is not in scope in the second subquery, and therefore that should parse
as an outer reference.
>> 2. LATERAL in general constrains both the join order and the join
>> plan, assuming any lateral references are actually made.
Robert> Peter seemed to be saying that LATERAL() must syntactically
Robert> follow the same-level FROM items to which it refers. Is that
Robert> your understanding also?
LATERAL references must be to items defined in the same FROM clause and
to the left of the LATERAL.
The relevant language of the spec seems to be:
a) If TR is contained in a <from clause> FC with no intervening
<query expression>, then the scope clause SC of TR is the <select
statement: single row> or innermost <query specification> that
contains FC. The scope of a range variable of TR is the <select
list>, <where clause>, <group by clause>, <having clause>, and
<window clause> of SC, together with every <lateral derived table>
that is simply contained in FC and is preceded by TR, and every
<collection derived table> that is simply contained in FC and is
preceded by TR, and the <join condition> of all <joined table>s
contained in SC that contain TR. If SC is the <query specification>
that is the <query expression body> of a simple table query STQ,
then the scope of a range variable of TR also includes the <order
by clause> of STQ.
>> 4. LATERAL allows some optimizations that aren't currently done, either
>> by explicitly rewriting the query, or (in theory) the optimizer itself
>> could consider a lateral plan (I believe Oracle does this). This would
>> apply to queries of this form:
>> SELECT ... FROM t1 LEFT JOIN (t2 JOIN t3 ON (t2.a=t3.a)) on (t1.a=t2.a);
>> which currently forces the t2/t3 join to occur first even where t1 is
>> small; this could be rewritten with LATERAL as:
>> SELECT ...
>> FROM t1
>> LEFT JOIN LATERAL (select * from t2 join t3 on (t2.a=t3.a)
>> where t2.a=t1.a) s
>> ON true;
Robert> Well, you haven't actually commuted the joins here - how do
Robert> you have in mind for PostgreSQL to execute this? I'm
Robert> guessing that it's something like a nest loop with t1 as the
Robert> outer side and the lateral subquery as the inner side, so
Robert> that the executor repeatedly executes
Robert> "select * from t2 join t3 on t2.a = t3.a where t2.a = $1"?
The current execution plans for this type of query are completely
disastrous if t1 is small (or qualified so as to be small) and t2 and
t3 are large. Having LATERAL would allow the query to be rewritten to
perform reasonably; a bonus would be for the planner to consider the
lateral join automatically without requiring it to be explicitly
|Next Message||Andrew Dunstan||2009-09-10 03:27:24||Re: Ragged CSV import|
|Previous Message||Robert Haas||2009-09-10 03:19:07||Re: Ragged CSV import|