Re: Early WIP/PoC for inlining CTEs

From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Fetter <david(at)fetter(dot)org>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Early WIP/PoC for inlining CTEs
Date: 2018-12-31 23:46:57
Message-ID: d4daf011-01cf-6bf9-3f15-85ae1572e909@proxel.se
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On 11/16/18 10:15 PM, Tom Lane wrote:
> I took a little bit of a look through this. Some thoughts:

Thanks for the review! I have decided to pick up this patch and work on
it since nothing has happened in a while. Here is a new version with
most of the feedback fixed.

> * This is not the idiomatic way to declare an expression tree walker:
>
> +static bool inline_cte_walker(Node *node, void *ctxp)
> +{
> + struct inline_cte_walker_ctx *ctx = ctxp;
>
> * I have no faith in the idea that we can skip doing a copyObject on the
> inlined subquery, except maybe in the case where we know there's exactly
> one reference.
>
> * In "here's where we do the actual substitution", if we're going to
> scribble on the RTE rather than make a new one, we must take pains
> to zero out the RTE_CTE-specific fields so that the RTE looks the
> same as if it had been a RTE_SUBQUERY all along; cf db1071d4e.
>
> * The lack of comments about what conditions we inline under
> (at subselect.c:1318) is distressing. I'm not particularly
> in love with the way that inline_cte_walker is commented, either.
> And dare I mention that this falsifies the intro comment for
> SS_process_ctes?
>
> * Speaking of the comments, I'm not convinced that view rewrite is
> a good comparison point; I think this is more like subquery pullup.

I believe I have fixed these except for the comment on the conditions
for when we inline.

Andrew Gierth: Why did you chose to not inline on FOR UPDATE but inline
volatile functions? I feel that this might be inconsistent since in both
cases the query in the CTE can change behavior if the planner pushes a
WHERE clause into the subquery, but maybe I am missing something.

> * I wonder whether we should make range_table_walker more friendly
> to the needs of this patch. The fact that it doesn't work for this usage
> suggests that it might not work for others, too. I could see replacing
> the QTW_EXAMINE_RTES flag with QTW_EXAMINE_RTES_BEFORE and
> QTW_EXAMINE_RTES_AFTER so that callers could say which order of operations
> they want (ie, visit RTE itself before or after its substructure). Then
> we could get rid of the double traversal of the RTE list.

I did as suggested and the code is now much cleaner, but I feel while
RTE walking business would become cleaner if we could change from having
the range table walker yield both RTEs and the contents of the RTEs to
having it just yeild the RTEs and then the walker callback can call
expression_tree_walker() with the RTE so RTEs are treated like any other
node in the tree.

I might look into how big impact such a change would have and if it is
worth the churn.

> * I think a large fraction of the regression test cases that this
> changes are actually broken by the patch, in the sense that we meant
> to test the behavior with a CTE and now we're not getting that.
> So we'd need to add MATERIALIZED in many more places than this has
> done. Somebody else (Stephen?) would need to opine on whether that's
> true for the CTEs in rowsecurity.sql, but it's definitely true for
> the one in rowtypes.sql, where the point is to test what happens
> with a whole-row Var.

Agreed, fixed.

> * Which will mean we need some new test cases showing that this patch does
> anything. It'd be a good idea to add a test case showing that this gets
> things right for conflicting CTE names at different levels, eg
>
> explain verbose
> with x as (select 1 as y)
> select * from (with x as (select 2 as y) select * from x) ss;

Added this test case, but more are needed. Any suggestion for what file
these tests belong (right now I just added it to subselect.sql)? Or
should I add a new called cte.sql?

> * ruleutils.c needs adjustments for the new syntax, if we keep that.

Thanks, fixed!

> * And of course the documentation needs much more work than this.

Yeah, I was waiting for there to be more agreement on when CTEs should
be inlined, but maybe I should start writing anyway.

Andreas

Attachment Content-Type Size
inlining-ctes-v6.patch text/x-patch 24.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2019-01-01 00:38:58 Re: monitoring CREATE INDEX [CONCURRENTLY]
Previous Message Adam Brightwell 2018-12-31 22:23:23 Re: Changing SQL Inlining Behaviour (or...?)