Re: [Patch] add new parameter to pg_replication_origin_session_setup

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Doruk Yilmaz <doruk(at)mixrank(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: [Patch] add new parameter to pg_replication_origin_session_setup
Date: 2026-02-04 06:02:29
Message-ID: CAJpy0uD5T692uyaRr0GKqd1eLHW55A0=MH-V8tRPicTxDqMu=g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 4, 2026 at 12:38 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>
> The new error message is not great:
>
> postgres=# select pg_replication_origin_session_setup('myorigin', 12345678);
> ERROR: could not find replication state slot for replication origin
> with OID 1 which was acquired by 12345678
>
> Firstly, replication origin is not an OID. Secondly, it's a little
> confusing because the "replication state slot" is in fact present.
> However, it's currently inactive, i.e. not "acquired" by the given PID.
>
> I propose to change that to:
>
> postgres=# select pg_replication_origin_session_setup('myorigin', 12345678);
> ERROR: replication origin with ID 1 is not active for PID 12345678
>
> That's more in line with this neighboring message:
>
> ERROR: replication origin with ID 1 is already active for PID 701228

Thank You for your suggestions here. The suggested error message looks
better. Thus patch001 LGTM.

>
> I also wonder if the error code is appropriate. That error uses
> ERRCODE_OBJECT_IN_USE, but if the problem is that the origin is
> currently *not* active, that seems backwards. I didn't change that in
> the attached patch, but it's something to think about.

I agree that ERRCODE_OBJECT_IN_USE doesn’t seem like the best fit in
below code. IMO, ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE would be
more suitable.But let's hear from others.

if (curstate->acquired_by != acquired_by)
{
ereport(ERROR,
(errcode(ERRCODE_OBJECT_IN_USE),
errmsg("replication origin with ID %d is not active for PID %d",
curstate->roident, acquired_by)));
}

>
> The second patch rearranges the if-else statements to check those
> conditions. I found the current logic hard to follow, this makes them
> feel more natural, in my opinion at least.

I’m not fully convinced that it makes the code clearer or better to
understand. For example, consider the following error case:

else if (curstate->acquired_by == 0 && curstate->refcount > 0)
ereport(ERROR,
(errcode(ERRCODE_OBJECT_IN_USE),
errmsg("replication origin with ID %d is already active
in another process",
curstate->roident)));

The condition was self explanatory earlier. The condition has now been
rewritten (or reduced) to:
if (acquired_by == 0 && curstate->refcount > 0)

However, this is not exactly the same as the earlier logic. On closer
inspection, the condition 'curstate->acquired_by == 0', while no
longer stated explicitly, was implicitly guaranteed by the preceding
error block:

if (curstate->acquired_by != 0)
ERROR;

One way to make this clearer would be to structure the logic as:

if (curstate->acquired_by != 0)
ERROR;
else if (curstate->refcount > 0)
ERROR;

Even then, it is still not clear why this error (the case where
curstate->refcount > 0 and curstate->acquired_by == 0) is raised only
when acquired_by == 0, or how the same situation is expected to be
handled when acquired_by != 0. The earlier code was clearer in this
regard (at least to me), as it first handled the
'curstate->acquired_by != acquired_by' case, making the overall logic
somehow easier to understand. Perhaps we can try by adding a comment
or restructure in some other way if possible and needed?

> It has one user-visible
> effect: If you call the function with acquired_pid != 0 and the origin
> has no state slot, *and* there are no free slots, you previously got
> this error:
>
> postgres=# select pg_replication_origin_session_setup('other', 123);
> ERROR: could not find free replication state slot for replication
> origin with ID 2
> HINT: Increase "max_active_replication_origins" and try again.
>
> Now you get this:
>
> postgres=# select pg_replication_origin_session_setup('other', 123);
> ERROR: cannot use PID 123 for inactive replication origin with ID 2
>
> Both error messages are more or less appropriate in that situation, but
> I think the new behavior is slightly better. The fact that the origin is
> inactive feels like the bigger problem here.

I am okay with this change though. Let's see what others have to say.

thanks
Shveta

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Corey Huinker 2026-02-04 06:13:05 Re: Add expressions to pg_restore_extended_stats()
Previous Message Michael Paquier 2026-02-04 05:56:39 Re: Add expressions to pg_restore_extended_stats()