incomplete removal of not referenced CTEs

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: incomplete removal of not referenced CTEs
Date: 2016-09-01 19:36:13
Message-ID: 39c62675-6b37-5482-52c5-aa84c0fbe896@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

While investigating a CTE-related query, I've noticed that we don't
really remove all unreachable CTEs. For example, for this query

with a as (select 1),
b as (select * from a),
c as (select * from b)
select 2;

where none of the CTEs if (directly or indirectly) referenced from the
query, we get a plan like this:

QUERY PLAN
---------------------------------------------------------
Result (cost=0.03..0.04 rows=1 width=4)
CTE a
-> Result (cost=0.00..0.01 rows=1 width=4)
CTE b
-> CTE Scan on a (cost=0.00..0.02 rows=1 width=4)
(5 rows)

So we only remove the top-level CTE, but we fail to remove the other
CTEs because we don't tweak the refcount in SS_process_ctes().

Of course, it's harmless as none of those CTEs gets actually executed,
but is this intentional, or do we want/need to fix it? I don't see
anything about this in the docs, but it seems a bit awkward and
confusing to remove only some of the CTEs - I think we should either
remove all or none of them.

I don't think that should be particularly difficult - ISTM we need to
make SS_process_ctes a bit smarter, essentially by adding a loop to
remove the CTEs recursively (and decrease the refcount).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-09-01 19:41:16 Re: \timing interval
Previous Message Peter van Hardenberg 2016-09-01 19:19:45 Re: \timing interval