From: | "Vadim B(dot) Mikheev" <vadim(at)sable(dot)krasnoyarsk(dot)su> |
---|---|
To: | Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us> |
Cc: | hackers(at)postgresql(dot)org |
Subject: | Re: [HACKERS] subselect |
Date: | 1998-01-05 19:55:57 |
Message-ID: | 34B13ACD.B1A95805@sable.krasnoyarsk.su |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Bruce Momjian wrote:
>
> > always inner one in this case. The same for handling ALL modifier.
> > Note, that we generaly can't use aggregates here: we can't add MAX to
> > subquery in the case of > ALL (subquery), because of > ALL should return FALSE
> > if subquery returns NULL(s) but aggregates don't take NULLs into account.
>
> OK, here are my ideas. First, I think you have to handle subselects in
> the outer node because a subquery could have its own subquery. Also, we
I hope that this is no matter: if results of subquery (with/without sub-subqueries)
will go into temp table then this table will be re-scanned for each outer tuple.
> now have a field in Aggreg to all us to 'usenulls'.
^^^^^^^^
This can't help:
vac=> select * from x;
y
-
1
2
3
<<< this is NULL
(4 rows)
vac=> select max(y) from x;
max
---
3
==> we can't replace
select * from A where A.a > ALL (select y from x);
^^^^^^^^^^^^^^^
(NULL will be returned and so A.a > ALL is FALSE - this is what
Sybase does, is it right ?)
with
select * from A where A.a > (select max(y) from x);
^^^^^^^^^^^^^^^^^^^^
just because of we lose knowledge about NULLs here.
Also, I would like to handle ANY and ALL modifiers for all bool
operators, either built-in or user-defined, for all data types -
isn't PostgreSQL OO-like RDBMS -:)
> OK, here it is. I recommend we pass the outer and subquery through
> the parser and optimizer separately.
I don't like this. I would like to get parse-tree from parser for
entire query and let optimizer (on upper level) decide how to rewrite
parse-tree and what plans to produce and how these plans should be
merged. Note, that I don't object your methods below, but only where
to place handling of this. I don't understand why should we add
new part to the system which will do optimizer' work (parse-tree -->
execution plan) and deal with optimizer nodes. Imho, upper optimizer
level is nice place to do this.
>
> We parse the subquery first. If the subquery is not correlated, it
> should parse fine. If it is correlated, any columns we find in the
> subquery that are not already in the FROM list, we add the table to the
> subquery FROM list, and add the referenced column to the target list of
> the subquery.
>
> When we are finished parsing the subquery, we create a catalog cache
> entry for it called 'sub1' and make its fields match the target
> list of the subquery.
>
> In the outer query, we add 'sub1' to its target list, and change
> the subquery reference to point to the new range table. We also add
> WHERE clauses to do any correlated joins.
...
> Here is a more complex correlated subquery:
>
> select *
> from taba
> where col1 = (select col2
> from tabb
> where taba.col3 = tabb.col4)
>
> Here we must add 'taba' to the subquery's FROM list, and add col3 to the
> target list of the subquery. After we parse the subquery, add 'sub1' to
> the FROM list of the outer query, change 'col1 = (subquery)' to 'col1 =
> sub1.col2', and add to the outer WHERE clause 'AND taba.col3 = sub1.col3'.
> THe optimizer will do the correlation for us.
>
> In the optimizer, we can parse the subquery first, then the outer query,
> and then replace all 'sub1' references in the outer query to use the
> subquery plan.
>
> I realize making merging the two plans and doing IN and NOT IN is the
^^^^^^^^^^^^^^^^^^^^^
This is very easy to do! As I already said we have just change sub1
access path (SeqScan of sub1) with SeqScan of Material node with
subquery plan.
> real challenge, but I hoped this would give us a start.
Decision about how to record subquery stuff in to parse-tree
would be very good start -:)
BTW, note that for _expression_ subqueries (which are introduced without
IN, EXISTS, ALL, ANY - this follows Sybase' naming) - as in your examples -
we have to check that subquery returns single tuple...
Vadim
From | Date | Subject | |
---|---|---|---|
Next Message | Bruce Momjian | 1998-01-05 20:51:26 | Re: [HACKERS] subselect |
Previous Message | Constantin Teodorescu | 1998-01-05 18:43:45 | I want to change libpq and libpgtcl for better handling of large query results |