Re: Common Table Expressions applied; some issues remain

From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Yoshiyuki Asaba <y-asaba(at)sraoss(dot)co(dot)jp>, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Common Table Expressions applied; some issues remain
Date: 2009-05-22 15:59:03
Message-ID: 4136ffa0905220859i74168759n8237b103cbbed369@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(quoting more than usual to provide context because this is such an old thread)

On Sat, Oct 4, 2008 at 11:22 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I've applied the latest version of the CTE patch.  Congratulations on
> making that happen!
>
> There are still some loose ends that need to be considered, though.
>
> 1. As committed, the patch takes an extremely hard line about WITH
> queries being evaluated independently of the main query and only once
> per main query execution.  This could be seen as a good thing --- it
> provides much more determinism for execution of volatile functions
> within complex queries than was really available in the past.  It could
> also be seen as a bad thing --- in particular, we won't push any
> limiting qualifications from the main query into the WITH queries.
> So for instance
>
>        WITH q AS ( SELECT * FROM foo )
>        SELECT * FROM q WHERE key = 42;
>
> is going to be executed quite inefficiently; it won't use an index on
> foo.key.  I think we don't have much choice about this in the case of
> recursive WITH queries: it would be pretty difficult to determine
> whether pushing a restriction into a recursive WITH would change the
> results incorrectly.  However, for plain non-recursive WITHs it's all
> a matter of definition.  I gather from
> http://www.oracle-developer.net/display.php?id=212
> that Oracle chooses to treat WITH-queries as if they were plain
> sub-selects if they're non-recursive and only referenced once.
> That is, Oracle would rewrite the above into
>
>        SELECT * FROM ( SELECT * FROM foo ) AS q WHERE key = 42;
>
> and then flatten the sub-select and optimize normally.  It would
> not be hard to make Postgres do the same, but then we would lose
> some guarantees about predictable execution of volatile functions.
>
> I'm inclined to think that there is no reason to provide two
> different syntaxes to do the same thing, and so having the WITH
> syntax behave like this is okay.  But it could well result in
> performance surprises for people who are used to Oracle.
>
> Any thoughts on what to do?  One possibility is to flatten only
> if the subquery doesn't contain any volatile functions.
>

One possibility would be to not flatten the query but find these quals
and copy them onto the cte when planning the cte. So we would still
materialize the result and avoid duplicate execution but only fetch
the records which we know a caller will need. We could even do that
for multiple callers if we join their quals with an OR -- that still
might allow a bitmap index scan.

I'm not sure we will work out with the order of in which the various
phases of analysis are done on the outer query compared to the
subquery.

--
greg

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Stark 2009-05-22 16:11:46 Re: RFD: Discarded tuple count for SeqScan nodes in EXPLAIN ANALYZE
Previous Message Tom Lane 2009-05-22 15:54:21 Re: RFD: Discarded tuple count for SeqScan nodes in EXPLAIN ANALYZE