Re: [BUG] Unexpected action when publishing partition tables

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Subject: Re: [BUG] Unexpected action when publishing partition tables
Date: 2021-10-18 06:54:27
Message-ID: CAA4eK1JS7dcsrRvPVT36kBu3QtJh0WP_JjN7LfeTU5xJCNLd=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 13, 2021 at 9:11 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>
> Hi Amit,
>
> On Fri, Oct 8, 2021 at 12:47 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > On Thu, Oct 7, 2021 at 12:39 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > > Sorry that I didn't comment on this earlier, but I think either
> > > GetPubPartitionOptionRelations() or InvalidatePublicationRels()
> > > introduced in the commit 4548c76738b should lock the partitions, just
> > > like to the parent partitioned table would be, before invalidating
> > > them. There may be some hazards to invalidating a relation without
> > > locking it.
> >
> > I see your point but then on the same lines didn't the existing code
> > "for all tables" case (where we call CacheInvalidateRelcacheAll()
> > without locking all relations) have a similar problem.
>
> There might be. I checked to see how other callers/modules use
> CacheInvalidateRelcacheAll(), though it seems that only the functions
> in publicationcmds.c use it or really was invented in 665d1fad99e for
> use by publication commands.
>
> Maybe I need to look harder than I've done for any examples of hazard.
>
> > Also, in your
> > patch, you are assuming that the callers of GetPublicationRelations()
> > will lock all the relations but what when it gets called from
> > AlterPublicationOptions()?
>
> Ah, my bad. I hadn't noticed that one for some reason.
>
> Now that you mention it, I do find it somewhat concerning (on the
> similar grounds as what prompted my previous email) that
> AlterPublicationOptions() does away with any locking on the affected
> relations.
>
> Anyway, I'll think a bit more about the possible hazards of not doing
> the locking and will reply again if there's indeed a problem(s) that
> needs to be fixed.
>

I think you can try to reproduce the problem via the debugger. You can
stop before calling GetPubPartitionOptionRelations in
publication_add_relation() in session-1 and then from another session
(say session-2) try to delete one of the partition table (without
replica identity). Then stop in session-2 somewhere after acquiring
lock to the corresponding partition relation. Now, continue in
session-1 and invalidate the rels and let it complete the command. I
think session-2 will complete the update without processing the
invalidations.

If the above is true, then, this breaks the following behavior
specified in the documentation: "The tables added to a publication
that publishes UPDATE and/or DELETE operations must have REPLICA
IDENTITY defined. Otherwise, those operations will be disallowed on
those tables.". Also, I think such updates won't be replicated on
subscribers as there is no replica identity or primary key column.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-10-18 06:56:15 Re: [BUG] Unexpected action when publishing partition tables
Previous Message Michael Paquier 2021-10-18 06:45:10 Re: try_relation_open and relation_open behave different.