Common Table Expressions applied; some issues remain

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Yoshiyuki Asaba <y-asaba(at)sraoss(dot)co(dot)jp>, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Common Table Expressions applied; some issues remain
Date: 2008-10-04 22:22:23
Message-ID: 9623.1223158943@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I've applied the latest version of the CTE patch. Congratulations on
making that happen!

There are still some loose ends that need to be considered, though.

1. As committed, the patch takes an extremely hard line about WITH
queries being evaluated independently of the main query and only once
per main query execution. This could be seen as a good thing --- it
provides much more determinism for execution of volatile functions
within complex queries than was really available in the past. It could
also be seen as a bad thing --- in particular, we won't push any
limiting qualifications from the main query into the WITH queries.
So for instance

WITH q AS ( SELECT * FROM foo )
SELECT * FROM q WHERE key = 42;

is going to be executed quite inefficiently; it won't use an index on
foo.key. I think we don't have much choice about this in the case of
recursive WITH queries: it would be pretty difficult to determine
whether pushing a restriction into a recursive WITH would change the
results incorrectly. However, for plain non-recursive WITHs it's all
a matter of definition. I gather from
http://www.oracle-developer.net/display.php?id=212
that Oracle chooses to treat WITH-queries as if they were plain
sub-selects if they're non-recursive and only referenced once.
That is, Oracle would rewrite the above into

SELECT * FROM ( SELECT * FROM foo ) AS q WHERE key = 42;

and then flatten the sub-select and optimize normally. It would
not be hard to make Postgres do the same, but then we would lose
some guarantees about predictable execution of volatile functions.

I'm inclined to think that there is no reason to provide two
different syntaxes to do the same thing, and so having the WITH
syntax behave like this is okay. But it could well result in
performance surprises for people who are used to Oracle.

Any thoughts on what to do? One possibility is to flatten only
if the subquery doesn't contain any volatile functions.

2. The patch didn't touch the implicit-RTE code, which means that

WITH q AS ( SELECT ... )
SELECT q.*

will fail even if you've got add_missing_from enabled. I'm inclined
to think that this violates the principle of least surprise. On
the other hand, add_missing_from is certainly a legacy thing and maybe
we shouldn't bother expending any extra code to make it work with
new features. Thoughts?

3. ruleutils.c's get_name_for_var_field() hasn't implemented the
RTE_CTE case, which means that it doesn't work to reverse-list
examples like this:

explain verbose with qq as (select x from (values(1,2),(3,4)) as x(c1,c2))
select * from (select (x).c2 from qq offset 0) ss;

The reason I let this go is that while poking into it I found out that
get_name_for_var_field is pretty broken already; this fails in HEAD:

explain verbose select (x).c2 from
(select x from (values(1,2),(3,4)) as x(c1,c2) offset 0) ss ;

and this fails even in the back branches:

explain select * from
(select x from (values(1,2),(3,4)) as x(c1,c2) offset 0) ss
where (x).c2 > 0;

It seems we need some redesign in and around EXPLAIN to make that work
nicely, so I figured it would be reasonable to tackle that stuff as a
separate patch.

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2008-10-04 22:39:49 Re: Obsolete functions istrue, isfalse, etc.
Previous Message Peter Eisentraut 2008-10-04 20:44:16 Obsolete functions istrue, isfalse, etc.