Re: Error "initial slot snapshot too large" in create replication slot

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: robertmhaas(at)gmail(dot)com
Cc: dilipbalaut(at)gmail(dot)com, stark(dot)cfm(at)gmail(dot)com, michael(at)paquier(dot)xyz, andres(at)anarazel(dot)de, jchampion(at)timescale(dot)com, y(dot)sokolov(at)postgrespro(dot)ru, rjuju123(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Error "initial slot snapshot too large" in create replication slot
Date: 2024-01-12 02:46:55
Message-ID: 20240112.114655.1741765470181903212.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for the comments. This will help move the discussion
forward.

At Fri, 5 Jan 2024 15:17:11 -0500, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in
> On Thu, Mar 23, 2023 at 11:02 PM Kyotaro Horiguchi
> <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > [ new patch ]
>
> Well, I guess nobody is too excited about fixing this, because it's
> been another 10 months with no discussion. Andres doesn't even seem to
> think this is as much a bug as it is a limitation, for all that it's
> filed in the CF application under bug fixes. I kind of wonder if we
> should just close this entry in the CF, but I'll hold off on that for
> now.

Perhaps you are correct. Ultimately, this issue is largely
theoretical, and I don't believe anyone would be inconvenienced by
imposing this contraint.

> * note: all ids in xip[] satisfy xmin <= xip[i] < xmax
> */
>
> I have to say that I don't like this at all. It's bad enough that we
> already use the xip/subxip arrays in two different ways depending on
> the situation. Increasing that to three different ways seems painful.
> How is anyone supposed to keep track of how the array is being used at
> which point in the code?

I understand. So, summirizing the current state briefly, I believe it
as follows:

a. snapbuild lacks a means to differentiate between top and sub xids
during snapshot building.

b. Abusing takenDuringRecovery could lead to potential issues, so it
has been rejected.

c. Dynamic sizing of xip is likely to have a significant impact on
performance (as mentioned in the comments of GetSnapshotData).

d. (new!) Adding a third storing method is not favored.

As a method to satisfy these prerequisites, I think one direction
could be to split takenDuringRecovery into flags indicating the
storage method and creation timing. I present this as a last-ditch
effort. It's a rough proposal, and further validation should be
necessary. If this direction is also not acceptable, I'll proceed with
removing the CF entry.

> If we are going to do that, I suspect it needs comment updates in more
> places than what the patch does currently. For instance:
>
> + if (newxcnt < xiplen)
> + newxip[newxcnt++] = xid;
> + else
> + newsubxip[newsubxcnt++] = xid;
>
> Just imagine coming across this code in 5 or 10 years and finding that
> it had no comment explaining anything. Yikes!

^^;

> Aside from the details of the patch, and perhaps more seriously, I'm
> not really clear that we have consensus on an approach. A few
> different proposals seem to have been floated, and it doesn't seem
> like anybody hates anybody else's idea completely, but it doesn't
> really seem like everyone agrees on what to do, either.

I don't fully agree with that.It's not so much that I dislike other
proposals, but rather that we haven't been able to find a definitive
solution that stands out.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v9-0001-Lift-initial-snapshot-limit-for-logical-replicati.patch text/x-patch 8.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2024-01-12 02:49:39 Re: Improve the connection failure error messages
Previous Message Melanie Plageman 2024-01-12 02:04:59 Re: Emit fewer vacuum records by reaping removable tuples during pruning