Re: Alias collision in `refresh materialized view concurrently`

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Bernd Helmle <mailings(at)oopsware(dot)de>, Mathis Rudolf <mathis(dot)rudolf(at)credativ(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Alias collision in `refresh materialized view concurrently`
Date: 2021-08-06 14:48:40
Message-ID: 2488325.1628261320@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Michael Paquier <michael(at)paquier(dot)xyz> writes:
> On Wed, Jun 02, 2021 at 03:44:45PM +0530, Bharath Rupireddy wrote:
>> Thanks.The changes with that approach are very minimal. PSA v5 and let
>> me know if any more changes are needed.

> Simple enough, so applied and back-patched.

I just came across this issue while preparing the release notes.
ISTM that people have expended a great deal of effort on a fundamentally
unreliable solution, when a reliable one is easily available.
The originally complained-of issue was that a user-chosen column name
could collide with the query-chosen table name:

ERROR: column reference "mv" is ambiguous
LINE 1: ...alog.=) mv.mv AND newdata OPERATOR(pg_catalog.*=) mv) WHERE ...

This is true, but it's self-inflicted damage, because all you have
to do is write the query so that mv is clearly a table name:

... mv.mv AND newdata.* OPERATOR(pg_catalog.*=) mv.*) WHERE ...

AFAICT that works and generates the identical parse tree to the original
coding. The only place touched by the patch where it's a bit difficult to
make the syntax unambiguous this way is

"CREATE TEMP TABLE %s AS "
"SELECT _$mv.ctid AS tid, _$newdata "
"FROM %s _$mv FULL JOIN %s _$newdata ON (",

because newdata.* would ordinarily get expanded to multiple columns
if it's at the top level of a SELECT list, and that's not what we want.
However, that's easily fixed using the same hack as in ruleutils.c's
get_variable: add a no-op cast to the table's rowtype. So this
would look like

appendStringInfo(&querybuf,
"CREATE TEMP TABLE %s AS "
"SELECT mv.ctid AS tid, newdata.*::%s "
"FROM %s mv FULL JOIN %s newdata ON (",
diffname, matviewname, matviewname, tempname);

Given that it took this long to notice the problem at all, maybe
this is not a fix to cram in on the weekend before the release wrap.
But I don't see why we need to settle for "mostly works" when
"always works" is barely any harder.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Himanshu Upadhyaya 2021-08-06 15:23:17 Re: Support reset of Shared objects statistics in "pg_stat_reset" function
Previous Message John Naylor 2021-08-06 14:48:30 Re: RFC: Improve CPU cache locality of syscache searches