Re: Alias collision in `refresh materialized view concurrently`

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Bernd Helmle <mailings(at)oopsware(dot)de>, Mathis Rudolf <mathis(dot)rudolf(at)credativ(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Alias collision in `refresh materialized view concurrently`
Date: 2021-05-21 10:26:31
Message-ID: CALj2ACVaYYfzTc4jyFTRsnjyub37jVvdbGv=JMxKXGbLz_FcBA@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 21, 2021 at 6:08 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Thu, May 20, 2021 at 09:14:45PM +0530, Bharath Rupireddy wrote:
> > On Thu, May 20, 2021 at 7:52 PM Bernd Helmle <mailings(at)oopsware(dot)de> wrote:
> >> "mv" looks like a very common alias (I use it all the time when
> >> testing or playing around with materialized views, so I'm wondering why
> >> I haven't seen this issue myself already). So the risk of such a
> >> collision looks very high. We can try to lower this risk by choosing an
> >> alias name that is not so common. With a static alias, however, you get
> >> a static error condition, not something that fails now and then.
> >
> > Another idea is to use the random() function to generate the required
> > number of uint32 random values (refresh_by_match_merge might need 3
> > values to replace newdata, newdata2 and mv) and use names like
> > pg_temp_rmv_<<rand_no1>>, pg_temp_rmv_<<rand_no2>> and so on. This
> > would make the names unguessable. Note that we use this in
> > choose_dsm_implementation, dsm_impl_posix.
>
> I am not sure that I see the point of using a random() number here
> while the backend ID, or just the PID, would easily provide enough
> entropy for this internal alias. I agree that "mv" is a bad choice
> for this alias name. One thing that comes to mind here is to use an
> alias similar to what we do for dropped attributes, say
> ........pg.matview.%d........ where %d is the PID. This is very
> unlikely to cause conflicts.

I agree that the backend ID and/or the PID is enough. I'm not fully
convinced about using random(). To make it more concrete, how about
something like pg.matview.%d.%d (MyBackendId, MyProcPid)? If users still
see collisions, then IMHO it's better to ensure that table/alias names
of this kind are not generated outside of the server.
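
To illustrate the naming scheme, here is a minimal standalone sketch of
how such an alias could be formatted. It is not taken from any patch:
build_matview_alias and ALIAS_BUFLEN are hypothetical names, and the
backend ID and PID are passed in as plain ints rather than read from
MyBackendId and MyProcPid, so that it compiles outside the server.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumption: sized like PostgreSQL's default NAMEDATALEN (64). */
#define ALIAS_BUFLEN 64

/*
 * Hypothetical helper, not part of PostgreSQL: writes an alias of the
 * form pg.matview.<backend_id>.<pid> into buf.
 *
 * Returns 0 on success, or -1 if formatting failed or the result did
 * not fit in buflen bytes (including the terminating NUL).
 */
int
build_matview_alias(char *buf, size_t buflen, int backend_id, int pid)
{
    int n = snprintf(buf, buflen, "pg.matview.%d.%d", backend_id, pid);

    if (n < 0 || (size_t) n >= buflen)
        return -1;
    return 0;
}
```

Since the name contains dots, it would have to be double-quoted wherever
it is interpolated into the generated SQL.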

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
