From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, tomas(at)vondra(dot)me
Subject: Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly
Date: 2025-06-10 20:14:16
Message-ID: CAPpHfdt77k3BqD=anh4b7UiRevfPOu2Hcf3rvSZXntxH+4G=Ug@mail.gmail.com
Lists: pgsql-hackers
On Mon, Jun 9, 2025 at 7:09 PM Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru> wrote:
> > I think we can use this approach for HEAD and probably keep the
> > previous idea for back branches. Keeping some value in shared memory
> > per slot sounds risky to me in terms of introducing new bugs.
>
> Not sure what kind of problems might occur. I propose to allocate in
> shmem an array of last_saved_restart_lsn values, as below, which is not
> part of the public API. It would be allocated and deallocated in shmem
> the same way as ReplicationSlotCtlData. I can prepare a patch, if needed.
>
> typedef struct ReplicationSlotCtlDataExt
> {
>     XLogRecPtr  last_saved_restart_lsn[1];
> } ReplicationSlotCtlDataExt;
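
For concreteness, here is a hypothetical sketch of how such an array could be sized and initialized in shared memory, assuming the ReplicationSlotCtlDataExt definition quoted above. The function names and the shmem label are illustrative, not from any posted patch:

    #include "postgres.h"
    #include "replication/slot.h"   /* max_replication_slots */
    #include "storage/shmem.h"      /* ShmemInitStruct, mul_size */

    /* Space for one last_saved_restart_lsn entry per possible slot. */
    static Size
    ReplicationSlotsExtShmemSize(void)
    {
        return offsetof(ReplicationSlotCtlDataExt, last_saved_restart_lsn) +
            mul_size(max_replication_slots, sizeof(XLogRecPtr));
    }

    /* Allocate (or attach to) the array, zeroing it on first creation. */
    static void
    ReplicationSlotsExtShmemInit(void)
    {
        ReplicationSlotCtlDataExt *ext;
        bool        found;

        ext = (ReplicationSlotCtlDataExt *)
            ShmemInitStruct("Replication Slot Ctl Ext",
                            ReplicationSlotsExtShmemSize(),
                            &found);
        if (!found)
            MemSet(ext, 0, ReplicationSlotsExtShmemSize());
    }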
This could work, but I don't think it is a solution for HEAD anyway. On
HEAD, it would be better to keep everything inside the ReplicationSlot
struct. At the same time, I don't like the idea of having different
shared memory structs between branches if we can avoid that.
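
For comparison, an abridged sketch of what keeping the value inside ReplicationSlot could look like on HEAD. The field name, placement, and comments here are illustrative; the attached v3-0001 patch is the authoritative change:

    typedef struct ReplicationSlot
    {
        /* ... existing fields elided ... */

        /* in-memory copy of the persistent data; may be ahead of disk */
        ReplicationSlotPersistentData data;

        /*
         * Latest restart_lsn known to be flushed to disk for this slot.
         * WAL removal must not go past the minimum of this value across
         * all in-use slots.
         */
        XLogRecPtr  last_saved_restart_lsn;
    } ReplicationSlot;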
> > Yeah, but with physical slots, it is possible that the slot's xmin
> > value points to some value, say 700 (after restart), while vacuum
> > has already removed tuples from transactions with IDs greater than
> > 700, as explained in email [1].
>
> I think we have no xmin problem for physical slots. The xmin values of
> physical slots are used to process hot standby feedback (HSF) messages.
> If I understood you correctly, you are describing the problem that hot
> standby feedback solves: it prevents vacuuming of tuples on the primary
> that standby queries may still need, thereby avoiding recovery conflicts
> on the replica (a standby query may be reading tuples that vacuum has
> removed on the primary, with the removal then replicated to the standby).
> If the primary receives an HSF message after the slot has been saved, I
> believe it is acceptable for autovacuum to clean tuples according to the
> newer xmin. If the primary restarts, the older value will be loaded, but
> the replica has already confirmed the newer value. As for the replica, it
> is the replica's obligation to send an HSF xmin that will survive its
> immediate restart.
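
To make the argument concrete, here is an illustrative timeline (the xmin values are invented; this sketches the reasoning above and is not code from any patch):

    /*
     * t0: standby computes feedback xmin = 750 from state that survives
     *     its own immediate restart
     * t1: standby sends HSF(xmin = 750); the primary updates the slot's
     *     xmin in memory only, the on-disk copy still says 700
     * t2: vacuum on the primary removes dead tuples using horizon 750
     * t3: primary restarts; the slot reloads xmin = 700 from disk
     *
     * The reloaded 700 understates what vacuum has already done, but
     * this is harmless: the standby confirmed 750 and will never request
     * rows older than that, and the next HSF message brings the slot's
     * xmin back to 750.
     */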
+1
> >> Taking these thoughts into account, I can't see any problems with the
> >> alternative patch, where the oldest WAL LSN is calculated only during
> >> checkpoint.
> > The alternative will needlessly prevent removing WAL segments in some
> > cases when logical slots are in use.
>
> IMHO, I'm not sure it will significantly impact WAL removal. We remove
> WAL segments only at checkpoint. The alternative solution gets the
> oldest required WAL segment at the beginning of the checkpoint, then
> saves dirty slots to disk, and removes old WAL segments at the end of
> the checkpoint using the value obtained at the beginning. It may be
> less effective in terms of WAL segment removal if a logical slot is
> advanced during the checkpoint, but I do not think that is a
> significant issue. On the other hand, the alternative solution
> simplifies the logic of WAL removal, is backward compatible (it avoids
> adding new in-memory state), and reduces locking in
> ReplicationSlotsComputeRequiredLSN: there is no need to recalculate the
> oldest slots' restart_lsn every time a slot is advanced.
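
A rough pseudocode sketch of the checkpoint ordering described in the quote above; every function name here is hypothetical, not the actual checkpointer code:

    static void
    CheckpointRemoveOldWAL(void)
    {
        XLogRecPtr  oldest_lsn;

        /* Step 1: capture the oldest required restart_lsn up front. */
        oldest_lsn = OldestSavedSlotRestartLSN();

        /*
         * Step 2: flush dirty slots.  Their on-disk restart_lsn can only
         * move forward, so it never falls behind oldest_lsn.
         */
        SaveDirtyReplicationSlots();

        /*
         * Step 3: remove WAL segments older than the value captured in
         * step 1.  If a slot advanced during the checkpoint, we keep a
         * little extra WAL until the next checkpoint, but we never
         * remove WAL that an on-disk restart_lsn still needs.
         */
        RemoveWALSegmentsOlderThan(oldest_lsn);
    }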
So, my proposal is to commit the attached patchset to HEAD, and commit
[1] to the back branches. Any objections?
------
Regards,
Alexander Korotkov
Supabase
Attachment | Content-Type | Size
---|---|---
v3-0001-Keep-WAL-segments-by-slot-s-flushed-restart-LSN.patch | application/octet-stream | 7.0 KB
v3-0002-Add-TAP-tests-to-check-replication-slot-advance-d.patch | application/octet-stream | 13.7 KB