Re: Single transaction in the tablesync worker?

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Single transaction in the tablesync worker?
Date: 2021-01-13 10:20:41
Message-ID: CAA4eK1KzNbudfwmJD-ureYigX6sNyCU6YgHscg29xWoZG6osvA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 13, 2021 at 11:18 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> On Mon, Jan 4, 2021 at 10:48 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > 7.
> > @@ -905,7 +905,7 @@ replorigin_advance(RepOriginId node,
> > LWLockAcquire(&replication_state->lock, LW_EXCLUSIVE);
> >
> > /* Make sure it's not used by somebody else */
> > - if (replication_state->acquired_by != 0)
> > + if (replication_state->acquired_by != 0 &&
> > replication_state->acquired_by != MyProcPid)
> > {
> >
> > I think you won't need this change if you do replorigin_advance before
> > replorigin_session_setup in your patch.
> >
>
> As you know the replorigin_session_setup sets the
> replication_state->acquired_by to be the current PID. So without this
> change the replorigin_advance rejects that same slot state thinking
> that it is already active for a different process. Root problem is
> that the same process/PID calling both functions would hang.
>

I think the hang happens only if we call unchanged replorigin_advance
after session_setup API, right?

> So this
> patch change allows replorigin_advance code to be called by self.
>
> IIUC that acquired_by check condition is like a sanity check for the
> originid passed. The patched code only does just like what the comment
> says:
> "/* Make sure it's not used by somebody else */"
> Doesn't "somebody else" means "anyone but me" (i.e. anyone but MyProcPid).
>
> Also, “setup” of a thing generally comes before usage of that thing,
> so won't it seem strange to do (like the suggestion) and deliberately
> call the "setup" function 2nd instead of 1st?
>
> Can you please explain why is it better to do it the suggested way
> (switch the calls around) than keep the patch code? Probably there is
> a good reason but I am just not understanding it.
>

Because there is no requirement for origin_advance API to be called
after session setup. Session setup is required to mark the node as
replaying from a remote node, see [1] whereas origin_advance is used
for setting up the initial location or setting a new location, see [2]
(pg_replication_origin_advance).

Now here, after creating the origin, we need to set up the initial
location and it seems fine to call origin_advance before
session_setup. In short, as such, I don't see any problem with your
change in replorigin_advance but OTOH, I don't see the need for the
same as well. So, let's try to avoid that change unless we can't do
without it.

Also, another thing is we need to take RowExclusiveLock on
pg_replication_origin as written in comments atop replorigin_advance
before calling it. See its usage in pg_replication_origin_advance.
Also, write comments on why we need to use replorigin_advance here
(... something, like we need to WAL log this for the purpose of
recovery...).

[1] - https://www.postgresql.org/docs/devel/replication-origins.html
[2] - https://www.postgresql.org/docs/devel/functions-admin.html#FUNCTIONS-REPLICATION

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2021-01-13 10:20:43 Re: support for MERGE
Previous Message Sergey Shinderuk 2021-01-13 10:17:36 Re: pg_preadv() and pg_pwritev()