Re: Common Table Expressions (WITH RECURSIVE) patch

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: pgsql(at)j-davis(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Common Table Expressions (WITH RECURSIVE) patch
Date: 2008-09-09 04:49:00
Message-ID: 20080909.134900.56565255.t-ishii@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the review.

[I dropped y-asaba(at)sraoss(dot)co(dot)jp from the Cc list since he has left our
company and the email address is being deleted.]

I'm going to look into issues which are seem to be bug (of course if
you know what to fix, patches are always welcome:-).

> These are my initial comments about the Common Table Expressions (CTE)
> patch, also known as WITH [RECURSIVE]. These comments are based on the
> patch here:
>
> http://archives.postgresql.org/pgsql-patches/2008-08/msg00021.php
>
> This is a semantically complex feature, and the standard is fairly
> complex as well. So I'm approaching this by making my own
> interpretations from the standard first (I included my interpretations
> and section references at the end of this email) and comparing to the
> behavior of the patch.
>
> The following examples may be inconsistent with the standard. Some
> have already been mentioned, and I don't think they all need to be
> fixed for 8.4, but I mention them here for completeness.
>
> * Mutual Recursion:
>
> with recursive
> foo(i) as (values(1) union all select i+1 from bar where i < 10),
> bar(i) as (values(1) union all select i+1 from foo where i < 10)
> select * from foo;
> ERROR: mutual recursive call is not supported
>
> The standard allows mutual recursion.

The discussion seems to agree that let it leave for post 8.4.

> * Single Evaluation:
>
> with
> foo(i) as (select random() as i)
> select * from foo union all select * from foo;
> i
> -------------------
> 0.233165248762816
> 0.62126633618027
> (2 rows)
>
> The standard specifies that non-recursive WITH should be evaluated
> once.

What shall we do? I don't think there's an easy way to fix this as Tom
suggested. Maybe we should not allow WITH clause without RECURISVE for
8.4?

> * RHS only:
>
> with recursive
> foo(i) as (select i+1 from foo where i < 10 union all values(1))
> select * from foo;
> ERROR: Left hand side of UNION ALL must be a non-recursive term in a
> recursive query
>
> The standard does not require that the recursive term be on the RHS.

The discussion seems to agree that let it leave for post 8.4.

> * UNION ALL only:
>
> with recursive
> foo(i) as (values(1) union select i+1 from foo where i < 10)
> select * from foo;
> ERROR: non-recursive term and recursive term must be combined with
> UNION ALL
>
> The standard seems to allow UNION ALL, UNION, INTERSECT, and EXCEPT
> (when the recursive term is not on the RHS of the EXCEPT).

The discussion seems to agree that let it leave for post 8.4.

> * Binary recursion and subselect strangeness:
>
> with recursive foo(i) as
> (values (1)
> union all
> select * from
> (select i+1 from foo where i < 10
> union all
> select i+1 from foo where i < X) t)
> select * from foo;
>
> Produces 10 rows of output regardless of what "X" is. This should be
> fixed for 8.4.
> Also, this is non-linear recursion, which the standard seems to
> disallow.

I will try to fix this. However detecting the query being not a
non-linear one is not so easy.

> * Multiple recursive references:
>
> with recursive foo(i) as
> (values (1)
> union all
> select i+1 from foo where i < 10
> union all
> select i+1 from foo where i < 20)
> select * from foo;
> ERROR: Left hand side of UNION ALL must be a non-recursive term in a
> recursive query
>
> If we're going to allow non-linear recursion (which the standard
> does not), this seems like it should be a valid case.

I will try to disallow this.

> * Strange result with except:
>
> with recursive foo(i) as
> (values (1)
> union all
> select * from
> (select i+1 from foo where i < 10
> except
> select i+1 from foo where i < 5) t)
> select * from foo;
> ERROR: table "foo" has 0 columns available but 1 columns specified
>
> This query works if you replace "except" with "union". This should be
> fixed for 8.4.

I will try to fix this.

> * Aggregates allowed:
>
> with recursive foo(i) as
> (values(1)
> union all
> select max(i)+1 from foo where i < 10)
> select * from foo;
>
> Aggregates should be blocked according to the standard.
> Also, causes an infinite loop. This should be fixed for 8.4.

I will try to fix this.

> * DISTINCT should supress duplicates:
>
> with recursive foo(i) as
> (select distinct * from (values(1),(2)) t
> union all
> select distinct i+1 from foo where i < 10)
> select * from foo;
>
> This outputs a lot of duplicates, but they should be supressed
> according to the standard. This query is essentially the same as
> supporting UNION for recursive queries, so we should either fix both for
> 8.4 or block both for consistency.

I'm not sure if it's possible to fix this. Will look into.

> * outer joins on a recursive reference should be blocked:
>
> with recursive foo(i) as
> (values(1)
> union all
> select i+1 from foo left join (values(1)) t on (i=column1))
> select * from foo;
>
> Causes an infinite loop, but the standard says using an outer join
> in this situation should be prohibited. This should be fixed for 8.4.

Not an issue, I think.

> * ORDER BY, LIMIT, and OFFSET are rejected for recursive queries. The
> standard does not seem to say that these should be rejected.
>
>
> The following are my interpretations of relevant parts of the SQL
> standard (200n), and the associated sections. These are only my
> interpretation, so let me know if you interpret the standard
> differently.
>
> Non-linear recursion forbidden:
> 7.13: Syntax Rules: 2.g.ii
> My interpretation of 2.g.ii.2 is that WQN[k] and WQN[l] may be the
> same <query name>.
> 7.13: Syntax Rules: 2.g.iv
>
> EXCEPT can't be used for recursive queries if a recursive reference
> appears on the RHS:
> 7.13: Syntax Rules: 2.g.iii.1
>
> INTERSECT ALL/EXCEPT ALL can't be used for recursive queries:
> 7.13: Syntax Rules: 2.g.iii.5
>
> recursive references must appear in FROM clause:
> 7.13: Syntax Rules: 2.g.iii.3
> My interpretation of this rule is that it does not allow a
> recursive reference in a subquery in the targetlist or a subquery
> in the where clause.
>
> stratum defined:
> 7.13: Syntax Rules: 2.f
> 7.13: Syntax Rules: 2.g.i.1
>
> recursive query must have anchor for every stratum:
> 7.13: Syntax Rules: 2.g.i.3
>
> outer joins not allowed to join recursive references:
> 7.13: Syntax Rules: 2.g.iii.6
> 7.13: Syntax Rules: 2.g.iii.7
>
> Aggregates/HAVING disallowed:
> 7.13: Syntax Rules: 2.g.iii.4.B
>
> Mutual recursion defined:
> 7.13: Syntax Rules: 2.g.i.1
>
> Evaluate each WITH entry once, even if it's referenced multiple times:
> 7.13: General Rules: 1
> 7.13: General Rules: 2.b
> See also:
> http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php
>
> Evaluation order with mutual recursion:
> 7.13: General Rules: 2.a
> 7.13: General Rules: 2.b
>
> Evaluation semantics:
> 7.13: General Rules: 2.c
>
> DISTINCT:
> 7.13: General Rules: 2.c.iv
> 7.13: General Rules: 2.c.ix.3.A
>
> I will provide comments about the code and documentation soon. This is a
> very useful feature.

Thanks. Enclosed is the latest patch to adopt CVS HEAD.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

Attachment Content-Type Size
recursive_query.patch.gz application/octet-stream 31.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2008-09-09 05:01:12 Re: Common Table Expressions (WITH RECURSIVE) patch
Previous Message Tatsuo Ishii 2008-09-09 04:45:55 Re: Common Table Expressions (WITH RECURSIVE) patch