| From: | "Vitaly Davydov" <v(dot)davydov(at)postgrespro(dot)ru> |
|---|---|
| To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Newly created replication slot may be invalidated by checkpoint |
| Date: | 2025-11-13 19:58:16 |
| Message-ID: | 1fe72d-69163880-1-1d2f60e0@30346926 |
| Lists: | pgsql-hackers |
Re-sending the original message due to a failure.
Hi Zhijie Hou, Amit,
> > I think the main issue here lies in the possibility that the minimum restart_lsn
> > obtained during a checkpoint could be less than the WAL position that is being
> > reserved concurrently. So, instead of serializing the redo assignment and WAL
> > reservation, Amit proposes serializing the CheckPointReplicationSlots() and WAL
> > reservation. This would ensure the following:
> >
> > 1) If the WAL reservation occurs first, the checkpoint must wait for the
> > restart_lsn to be updated before proceeding with WAL removal. This guarantees
> > that the most recent restart_lsn position is detected.
> >
> > 2) If the checkpoint calls CheckPointReplicationSlots() first, then any
> > subsequent WAL reservation must take a position later than the redo pointer.
Thank you for the explanation. I agree: Amit's patch solves the problem and is
the most promising solution. It carries less risk of introducing new bugs, and
there is no need to avoid locks in pursuit of maximum possible performance. I
tried to find corner cases, but failed.
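To restate the invariant in code form, here is a minimal sketch of the
serialization, assuming a hypothetical lock (called WALReservationLock below).
It is only my illustration of the two cases quoted above, not the actual patch:

```c
/*
 * Illustration only, not the actual patch; WALReservationLock is a
 * hypothetical lock that serializes the two paths.
 */

/* Backend side: reserve WAL for a new slot. */
LWLockAcquire(WALReservationLock, LW_SHARED);
restart_lsn = GetRedoRecPtr();
SpinLockAcquire(&slot->mutex);
slot->data.restart_lsn = restart_lsn;	/* published before the lock is released */
SpinLockRelease(&slot->mutex);
LWLockRelease(WALReservationLock);

/* Checkpointer side: flush slot state before deciding what WAL to remove. */
LWLockAcquire(WALReservationLock, LW_EXCLUSIVE);
CheckPointReplicationSlots(is_shutdown);	/* sees every published restart_lsn */
LWLockRelease(WALReservationLock);
/* Only WAL older than the minimum observed above may be removed. */
```

Either the reservation is published before the checkpointer computes the
removal horizon (case 1 above), or the reservation reads a redo pointer no
older than the one the checkpointer used (case 2).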
FYI, there is another, lock-less solution for ReplicationSlotReserveWal, based
on a two-phase reservation; it is attached as a diff file. It seems to solve
the redo-pointer race condition, but it is incomplete and more prone to bugs.
I share it here in the hope that such an approach may be of interest.
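For those who do not want to open the diff, the general shape of a two-phase
reservation is sketched below as a standalone model. This is only my
illustration of the idea, not the attached patch: phase 1 publishes a
conservative floor so that removal is blocked, phase 2 re-reads the redo
pointer and finalizes the reservation.

```c
/*
 * Standalone model of a two-phase reservation (NOT the attached diff).
 * redo_ptr models GetRedoRecPtr(); slot_restart_lsn models the slot's
 * published restart_lsn (0 = no reservation).
 */
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

static _Atomic uint64_t redo_ptr = 100;
static _Atomic uint64_t slot_restart_lsn = 0;

static void
reserve_two_phase(void)
{
	/* Phase 1: publish a floor so a concurrent checkpoint stops removal here. */
	uint64_t	floor_lsn = atomic_load(&redo_ptr);

	atomic_store(&slot_restart_lsn, floor_lsn);

	/* A checkpoint running at this point already sees our published floor. */

	/* Phase 2: re-read the redo pointer and finalize the reservation. */
	uint64_t	final_lsn = atomic_load(&redo_ptr);

	if (final_lsn > floor_lsn)
		atomic_store(&slot_restart_lsn, final_lsn);
}

int
main(void)
{
	reserve_two_phase();
	printf("reserved at %llu\n",
		   (unsigned long long) atomic_load(&slot_restart_lsn));
	return 0;
}
```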
While investigating the solution I came across some questions, shared below.
I do not expect answers, but I think they may be worth considering when
preparing the patch.
1) Looking at ReplicationSlotReserveWal, I'm not sure why we assign restart_lsn
in a loop and exit the loop once XLogGetLastRemovedSegno() < segno becomes true
(see the paraphrase below). I could understand it if we compared restart_lsn
with XLogCtl->replicationSlotMinLSN to handle parallel calls of
ReplicationSlotsComputeRequiredLSN, as described in the comment. The removal of
old segments happens much later, in the checkpointer. I'm not sure whether the
comment describes the case improperly or the code is wrong.
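For reference, the loop in question looks roughly like this (paraphrased from
memory, physical-slot path only; see ReplicationSlotReserveWal in slot.c for
the authoritative code):

```c
while (true)
{
	XLogSegNo	segno;

	restart_lsn = GetRedoRecPtr();
	SpinLockAcquire(&slot->mutex);
	slot->data.restart_lsn = restart_lsn;
	SpinLockRelease(&slot->mutex);

	/* Prevent WAL removal as fast as possible. */
	ReplicationSlotsComputeRequiredLSN();

	/*
	 * Retry only if the chosen segment has already been removed. Since
	 * removal happens much later, in the checkpointer, wouldn't a check
	 * against XLogCtl->replicationSlotMinLSN match the comment better?
	 */
	XLByteToSeg(slot->data.restart_lsn, segno, wal_segment_size);
	if (XLogGetLastRemovedSegno() < segno)
		break;
}
```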
2) Why do we call XLogSetReplicationSlotMinimumLSN outside of the section
protected by ReplicationSlotControlLock (see the paraphrase below)? It may lead
to race conditions when two or more backends create, advance, or drop slots in
parallel. I'm not sure the control and allocation locks properly serialise the
updates.
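To make the window concrete, here is a paraphrase of
ReplicationSlotsComputeRequiredLSN (from memory; see slot.c for the
authoritative code), with the interleaving I have in mind noted in a comment:

```c
void
ReplicationSlotsComputeRequiredLSN(void)
{
	XLogRecPtr	min_required = InvalidXLogRecPtr;

	LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
	/* ... scan all slots and compute the minimum restart_lsn ... */
	LWLockRelease(ReplicationSlotControlLock);

	/*
	 * The window: backend A computes an older minimum; backend B, after a
	 * slot advance, computes a newer one and stores it first; A then
	 * overwrites it with the stale value. Is that interleaving prevented?
	 */
	XLogSetReplicationSlotMinimumLSN(min_required);
}
```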
With regards,
Vitaly
| Attachment | Content-Type | Size |
|---|---|---|
| lockless.diff | text/x-patch | 3.8 KB |