CTE inlining

From: Ilya Shkuratov <motr(dot)ilya(at)ya(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Subject: CTE inlining
Date: 2017-04-29 17:45:00
Message-ID: 5351711493487900@web53g.yandex.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, dear hackers!

There is task in todo list about optional CTE optimization fence
disabling.

I am not interested at this point in disabling mechanism
implementation, but I would like to discuss the optimization
mechanism, that should work when the fence is disabled.

It seems, that we can replace CTE with subquery, so the optimizer
can do all available optimizations. This idea is quite
straightforward, but I could not find a discussion of it.
(Maybe it is so, because everyone knows that the idea is bad and it is
not worth to discuss. But I hope it is not, so I start this thread. =))

First of all, to such replacement to be valid, the CTE must be
1. non-writable (e.g. be of form: SELECT ...),
2. do not use VOLATILE or STABLE functions,
3. ... (maybe there must be more restrictions?)

Also, before inlining, we should check that some optimization
can be applied, using functions from
'pull_up_subqueries_recurse' and 'subquery_push_qual'.

If it is true, and there only one reference to CTE,
we can inline it immediately.

What it is not clear is how we should estimate whether it is worth
to inline, when there is multiple references. Here are my preliminary
ideas.

Let consider "pull up subquery" and "push down qualifiers" cases
separately.

For "push down qualifiers", if `subquery_push_qual` is `true`,
we can do the following:
1. copy CTE subquery,
2. push down quals,
3. find paths,
3. inline if cost of
(CTE scan) > (cheapest_path(subquery) + subquery scan)

Probably, this approach is not feasible, because it involves subquery
replaning, and we should consider a more "lightweight" heuristic.

For "pull up subquery" similar approach may lead to duplicate planning
of the whole query, that almost sure is too expensive.
So I wonder, is it possible to estimate a join predicate selectivity
against CTE subquery result and inline it if selectivity is "high" enough?
(If it is possible the same can be applied to the first case.)

I would be glad to hear feedback on described approach.

Ilya Shkuratov

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2017-04-29 18:18:45 Re: Re: logical replication and PANIC during shutdown checkpoint in publisher
Previous Message Teodor Sigaev 2017-04-29 17:35:05 Re: convert EXSITS to inner join gotcha and bug