Re: Early WIP/PoC for inlining CTEs

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: David Fetter <david(at)fetter(dot)org>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Early WIP/PoC for inlining CTEs
Date: 2018-11-16 21:15:10
Message-ID: 8810.1542402910@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andreas Karlsson <andreas(at)proxel(dot)se> writes:
> [ inlining-ctes-v5.patch ]

I took a little bit of a look through this. Some thoughts:

* I think it'd be a good idea if we made OFFSET/LIMIT in a CTE be an
alternate way of keeping it from being inlined. As the patch stands,
if that's the behavior you want, you have no way to express it in
a query that will also work in older servers. (I will manfully
resist suggesting that then we don't need the nonstandard syntax
at all ... oops, too late.)

* This is not the idiomatic way to declare an expression tree walker:

+static bool inline_cte_walker(Node *node, void *ctxp)
+{
+ struct inline_cte_walker_ctx *ctx = ctxp;

* I have no faith in the idea that we can skip doing a copyObject on the
inlined subquery, except maybe in the case where we know there's exactly
one reference.

* In "here's where we do the actual substitution", if we're going to
scribble on the RTE rather than make a new one, we must take pains
to zero out the RTE_CTE-specific fields so that the RTE looks the
same as if it had been a RTE_SUBQUERY all along; cf db1071d4e.

* The lack of comments about what conditions we inline under
(at subselect.c:1318) is distressing. I'm not particularly
in love with the way that inline_cte_walker is commented, either.
And dare I mention that this falsifies the intro comment for
SS_process_ctes?

* Speaking of the comments, I'm not convinced that view rewrite is
a good comparison point; I think this is more like subquery pullup.

* I wonder whether we should make range_table_walker more friendly
to the needs of this patch. The fact that it doesn't work for this usage
suggests that it might not work for others, too. I could see replacing
the QTW_EXAMINE_RTES flag with QTW_EXAMINE_RTES_BEFORE and
QTW_EXAMINE_RTES_AFTER so that callers could say which order of operations
they want (ie, visit RTE itself before or after its substructure). Then
we could get rid of the double traversal of the RTE list.

* I think a large fraction of the regression test cases that this
changes are actually broken by the patch, in the sense that we meant
to test the behavior with a CTE and now we're not getting that.
So we'd need to add MATERIALIZED in many more places than this has
done. Somebody else (Stephen?) would need to opine on whether that's
true for the CTEs in rowsecurity.sql, but it's definitely true for
the one in rowtypes.sql, where the point is to test what happens
with a whole-row Var.

* Which will mean we need some new test cases showing that this patch does
anything. It'd be a good idea to add a test case showing that this gets
things right for conflicting CTE names at different levels, eg

explain verbose
with x as (select 1 as y)
select * from (with x as (select 2 as y) select * from x) ss;

* ruleutils.c needs adjustments for the new syntax, if we keep that.

* And of course the documentation needs much more work than this.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2018-11-16 21:43:07 Re: [HACKERS] pgbench - allow to store select results into variables
Previous Message David Fetter 2018-11-16 20:26:50 Another limitation of built-in logical replication