Re: pg_replication_origin_drop API potential race condition

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Petr Jelinek <pjmodos(at)pjmodos(dot)net>, Euler Taveira <euler(at)eulerto(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_replication_origin_drop API potential race condition
Date: 2021-02-09 03:57:48
Message-ID: CAA4eK1JVcnGnfY8rF8pWXpvsuv960d0KFErdustoEViieFcUUw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 8, 2021 at 11:30 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> > +void
> > +replorigin_drop_by_name(char *name, bool missing_ok, bool nowait)
> > +{
> > + RepOriginId roident;
> > + Relation rel;
> > +
> > + Assert(IsTransactionState());
> > +
> > + /*
> > + * To interlock against concurrent drops, we hold ExclusiveLock on
> > + * pg_replication_origin throughout this function.
> > + */
>
> This comment is now wrong though; should s/throughout.*/till xact commit/
> to reflect the new reality.
>

Right, I'll fix in the next version.

> I do wonder if this is going to be painful in some way, since the lock
> is now going to be much longer-lived. My impression is that it's okay,
> since dropping an origin is not a very frequent occurrence. It is going
> to block pg_replication_origin_advance() with *any* origin, which
> acquires RowExclusiveLock on the same relation. If this is a problem,
> then we could use LockSharedObject() in both places (and make it last
> till end of xact for the case of deletion), instead of holding this
> catalog-level lock till end of transaction.
>

IIUC, you are suggesting to use lock for the particular origin instead
of locking the corresponding catalog table in functions
pg_replication_origin_advance and replorigin_drop_by_name. If so, I
don't see any problem with the same but please note that we do take
catalog-level lock in replorigin_create() which would have earlier
prevented create and drop to run concurrently. Having said that, I
don't see any problem with it because I think till the drop is
committed, the create will see the corresponding row as visible and we
won't generate the wrong origin_id. What do you think?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Nancarrow 2021-02-09 04:00:08 Re: Parallel INSERT (INTO ... SELECT ...)
Previous Message Yugo NAGATA 2021-02-09 03:23:23 Re: Is Recovery actually paused?