Re: Error "initial slot snapshot too large" in create replication slot

From: Andres Freund <andres(at)anarazel(dot)de>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: jchampion(at)timescale(dot)com, y(dot)sokolov(at)postgrespro(dot)ru, dilipbalaut(at)gmail(dot)com, rjuju123(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Error "initial slot snapshot too large" in create replication slot
Date: 2022-09-12 21:51:56
Message-ID: 20220912215156.zi52zu6n7od3k3v3@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Thanks for working on this!

I think this should include a test that fails without this change and succeeds
with it...

On 2022-07-19 11:55:06 +0900, Kyotaro Horiguchi wrote:
> From abcf0a0e0b3e2de9927d8943a3e3c145ab189508 Mon Sep 17 00:00:00 2001
> From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
> Date: Tue, 19 Jul 2022 11:50:29 +0900
> Subject: [PATCH v6] Create correct snapshot during CREATE_REPLICATION_SLOT

This sees a tad misleading - the previous snapshot wasn't borken, right?

> +/*
> + * ReorderBufferXidIsKnownSubXact
> + * Returns true if the xid is a known subtransaction.
> + */
> +bool
> +ReorderBufferXidIsKnownSubXact(ReorderBuffer *rb, TransactionId xid)
> +{
> + ReorderBufferTXN *txn;
> +
> + txn = ReorderBufferTXNByXid(rb, xid, false,
> + NULL, InvalidXLogRecPtr, false);
> +
> + /* a known subtxn? */
> + if (txn && rbtxn_is_known_subxact(txn))
> + return true;
> +
> + return false;
> +}

The comments here just seem to restate the code....

It's not obvious to me that it's the right design (or even correct) to ask
reorderbuffer about an xid being a subxid. Maybe I'm missing something, but
why would reorderbuffer even be guaranteed to know about all these subxids?

> @@ -568,9 +571,17 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
>
> MyProc->xmin = snap->xmin;
>
> - /* allocate in transaction context */
> + /*
> + * Allocate in transaction context.
> + *
> + * We could use only subxip to store all xids (takenduringrecovery
> + * snapshot) but that causes useless visibility checks later so we hasle to
> + * create a normal snapshot.
> + */

I can't really parse this comment at this point, and I seriously doubt I could
later on.

> @@ -591,12 +605,24 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
>
> if (test == NULL)
> {
> - if (newxcnt >= GetMaxSnapshotXidCount())
> - ereport(ERROR,
> - (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
> - errmsg("initial slot snapshot too large")));
> -
> - newxip[newxcnt++] = xid;
> + /* Store the xid to the appropriate xid array */
> + if (ReorderBufferXidIsKnownSubXact(builder->reorder, xid))
> + {
> + if (!overflowed)
> + {
> + if (newsubxcnt >= GetMaxSnapshotSubxidCount())
> + overflowed = true;
> + else
> + newsubxip[newsubxcnt++] = xid;
> + }
> + }
> + else
> + {
> + if (newxcnt >= GetMaxSnapshotXidCount())
> + elog(ERROR,
> + "too many transactions while creating snapshot");
> + newxip[newxcnt++] = xid;
> + }
> }

Hm, this is starting to be pretty deeply nested...

I wonder if a better fix here wouldn't be to allow importing a snapshot with a
larger ->xid array. Yes, we can't do that in CurrentSnapshotData, but IIRC we
need to be in a transactional snapshot anyway, which is copied anyway?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-09-12 21:53:17 Re: Error "initial slot snapshot too large" in create replication slot
Previous Message Thomas Munro 2022-09-12 21:50:54 Re: pgsql: Prefetch data referenced by the WAL, take II.