Re: subselects

From: "Vadim B(dot) Mikheev" <vadim(at)sable(dot)krasnoyarsk(dot)su>
To: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
Cc: hackers(at)postgresql(dot)org, "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
Subject: Re: subselects
Date: 1998-01-10 17:19:08
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian wrote:
> > No, I don't like to add anything in parser. Example:
> >
> > select *
> > from tabA
> > where col1 = (select col2
> > from tabB
> > where tabA.col3 = tabB.col4
> > and exists (select *
> > from tabC
> > where tabB.colX = tabC.colX and
> > tabC.colY = tabA.col2)
> > )
> >
> > : a column of tabA is referenced in sub-subselect
> This is a strange case that I don't think we need to handle in our first
> implementation.

I don't know is this strange case or not :)
But I would like to know is this allowed by standards - can someone
comment on this ?
And I don't see problems with handling this...

> > (is it allowable by standards ?) - in this case it's better
> > to don't add tabA to 1st subselect but add tabA to second one
> > and change tabA.col3 in 1st to reference col3 in 2nd subquery temp table -
> > this gives us 2-tables join in 1st subquery instead of 3-tables join.
> > (And I'm still not sure that using temp tables is best of what can be
> > done in all cases...)
> I don't see any use for temp tables in subselects anymore. After having
> implemented UNIONS, I now see how much can be done in the upper
> optimizer. I see you just putting the subquery PLAN into the proper
> place in the plan tree, with some proper JOIN nodes for IN, NOT IN.

When saying about temp tables, I meant tables created by node Material
for subquery plan. This is one of two ways - run subquery once for all
possible upper plan tuples and then just join result table with upper
query. Another way is re-run subquery for each upper query tuple,
without temp table but may be with caching results by some ways.
Actually, there is special case - when subquery can be alternatively
formulated as joins, - but this is just special case.

> > > In the parent query, to parse the WHERE clause, we create a new operator
> > > type, called IN or NOT_IN, or ALL, where the left side is a Var, and the
> > ^^^^^^^^^^^^^^^^^^
> > No. We have to handle (a,b,c) OP (select x, y, z ...) and
> > '_a_constant_' OP (select ...) - I don't know is last in standards,
> > Sybase has this.
> I have never seen this in my eight years of SQL. Perhaps we can leave
> this for later, maybe much later.

Are you saying about (a, b, c) or about 'a_constant' ?
Again, can someone comment on are they in standards or not ?
Tom ?
If yes then please add parser' support for them now...

> > Note, that there are no OP_IN, OP_NOTIN in OpType-s for Expr. We need in
> > IN, NOTIN in A_Expr (parser node), but both of them have to be transferred
> > by parser into corresponding ANY and ALL. At the moment we can do:
> >
> > IN --> = ANY, NOT IN --> <> ALL
> >
> > but this will be "known bug": this breaks OO-nature of Postgres, because of
> > operators can be overrided and '=' can mean s o m e t h i n g (not equality).
> > Example: box data type. For boxes, = means equality of _areas_ and =~
> > means that boxes are the same ==> =~ ANY should be used for IN.
> That is interesting, to use =~ for ANY.
> Yes, but how many operators take a SUBQUERY as an operand. This is a
> special case to me.
> I think I see where you are trying to go. You want subselects to behave
> like any other operator, with a subselect type, and you do all the
> subselect handling in the optimizer, with special Nodes and actions.
> I think this may be just too much of a leap. We have such clean query
> logic for single queries, I can't imagine having an operator that has a
> Query operand, and trying to get everything to properly handle it.
> UNIONS were very easy to implement as a List off of Query, with some
> foreach()'s in rewrite and the high optimizer.
> Subselects are SQL standard, and are never going to be over-ridden by a
> user. Same with UNION. They want UNION, they get UNION. They want
> Subselect, we are going to spin through the Query structure and give
> them what they want.
> The complexities of subselects and correlated queries and range tables
> and stuff is so bizarre that trying to get it to work inside the type
> system could be a huge project.

PostgreSQL is a robust, next-generation, Object-Relational DBMS (ORDBMS),
derived from the Berkeley Postgres database management system. While
PostgreSQL retains the powerful object-relational data model, rich data types and
easy extensibility of Postgres, it replaces the PostQuel query language with an
extended subset of SQL.

Should we say users that subselect will work for standard data types only ?
I don't see why subquery can't be used with ~, ~*, @@, ... operators, do you ?
Is there difference between handling = ANY and ~ ANY ? I don't see any.
Currently we can't get IN working properly for boxes (and may be for others too)
and I don't like to try to resolve these problems now, but hope that someday
we'll be able to do this. At the moment - just convert IN into = ANY and
NOT IN into <> ALL in parser.

(BTW, do you know how DISTINCT is implemented ? It doesn't use = but
use type_out funcs and uses strcmp()... DISTINCT is standard SQL thing...)

> >
> > > right side is an index to a slot in the subqueries List.
> I guess the question is what can we have by February 1?
> I have been reading some postings, and it seems to me that subselects
> are the litmus test for many evaluators when deciding if a database
> engine is full-featured.
> Sorry to be so straightforward, but I want to keep hashing this around
> until we get a conclusion, so coding can start.
> My suggestions have been, I believe, trying to get subselects working
> with the fullest functionality by adding the least amount of code, and
> keeping the logic clean.
> Have you checked out the UNION code? It is very small, but it works. I
> think it could make a good sample for subselects.

There is big difference between subqueries and queries in UNION -
there are not dependences between UNION queries.

Ok, opened issues:

1. Is using upper query' vars in all subquery levels in standard ?
2. Is (a, b, c) OP (subselect) in standard ?
3. What types of expressions (Var, Const, ...) are allowed on the left
side of operator with subquery on the right ?
4. What types of operators should we support (=, >, ..., like, ~, ...) ?
(My vote for all boolean operators).

And - did we get consensus on presentation subqueries stuff in Query,
Expr and Var ?
I would like to have something done in parser near Jan 17 to get
subqueries working by Feb 1. I vote for support of all standard
things (1. - 3.) in parser right now - if there will be no time
to implement something like (a, b, c) then optimizer will call
elog(WARN) (oh, sorry, - elog(ERROR)).


In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Vadim B. Mikheev 1998-01-10 17:41:19 Re: [HACKERS] subselects
Previous Message Thomas G. Lockhart 1998-01-10 16:34:25 Re: [HACKERS] grant broken