RE: Bug in query rewriter - hasModifyingCTE not getting set

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Bug in query rewriter - hasModifyingCTE not getting set
Date: 2021-05-21 06:41:57
Message-ID: TYAPR01MB2990E8BB8D4D97EACB7FCBC7FE299@TYAPR01MB2990.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com> writes:
> > Finally, I think I've understood what you meant. Yes, the current code seems
> to be wrong.
>
> I'm fairly skeptical of this claim, because that code has stood for a
> long time. Can you provide an example (not involving hasModifyingCTE)
> in which it's wrong?

Hmm, I don't think of an example. I wonder if attaching WITH before INSERT SELECT and putting WITH between INSERT and SELECT produce the same results. Maybe that's why the regression test succeeds with the patch.

To confirm, the question is that when we have the following rule in place and the client issues the query:

[rule]
CREATE RULE myrule AS
ON {INSERT | UPDATE | DELETE} TO orig_table
DO INSTEAD
INSERT INTO some_table SELECT ...;

[original query]
WITH t AS (
SELECT and/or NOTIFY
)
{INSERT INTO | UPDATE | DELETE FROM} orig_table ...;

which of the following two queries do we expect?

[generated query 1]
WITH t AS (
SELECT and/or NOTIFY
)
INSERT INTO some_table SELECT ...;

[generated query 2]
INSERT INTO some_table
WITH t AS (
SELECT and/or NOTIFY
)
SELECT ...;

Although both may produce the same results, I naturally expected query 1, because WITH was originally attached before the top-level query, and (2) the top-level query has been replaced with a rule action, so it's natural that the WITH is attached before the rule action. Super-abbreviated description is:

x -> y (rule)
WITH t x (original query)
WITH t y (generated query 1)
one-part-of-y WITH t another-part-of-y (generated query 2)

As we said, we agree to fail the query if it's the above generated query 2 and WITH contains a data-modyfing CTE, if we cannot be confident to accept the change to the WITH position. Which do you think we want to choose?

Regards
Takayuki Tsunakawa

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jürgen Purtz 2021-05-21 06:47:15 Re: Additional Chapter for Tutorial
Previous Message Michael Paquier 2021-05-21 06:32:05 Re: Move pg_attribute.attcompression to earlier in struct for reduced size?