Re: Physical replication slot advance is not persistent

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, simon(at)2ndquadrant(dot)com
Subject: Re: Physical replication slot advance is not persistent
Date: 2020-01-21 05:07:30
Message-ID: 20200121050730.GF2552@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 20, 2020 at 11:00:14AM -0800, Andres Freund wrote:
> On 2020-01-20 15:45:40 +0900, Michael Paquier wrote:
>> 1) The slot advancing has to mark the slot as dirty, but should we
>> make the change persistent at the end of the function or should we
>> wait for a checkpoint to do the work, meaning that any update done to
>> the slot would be lost if a crash occurs in-between? Note that we
>> have this commit in slotfuncs.c for
>> pg_logical_replication_slot_advance():
>> * Dirty the slot so it's written out at the next checkpoint.
>> * We'll still lose its position on crash, as documented, but it's
>> * better than always losing the position even on clean restart.
>>
>> This comment refers to the documentation for the logical decoding
>> section (see logicaldecoding-replication-slots in
>> logicaldecoding.sgml), and even if nothing can be done until the slot
>> advance function reaches its hand, we ought to make the data
>> persistent if we can.
>
> That doesn't really seem like a meaningful reference, because the
> concerns between constantly streaming out changes (where we don't want
> to fsync every single transaction), and doing so in a manual advance
> through an sql function, seem different.

No disagreement with that, still it is the only reference we have in
the docs about that. I think that we should take the occasion to
update the docs of the advancing functions accordingly with what we
think is the best choice; should the slot information be flushed at
the end of the function, or at the follow-up checkpoint?

>> 3) The amount of testing related to slot advancing could be better
>> with cluster-wide operations.
>>
>> @@ -370,6 +370,11 @@ pg_physical_replication_slot_advance(XLogRecPtr
>> moveto)
>> MyReplicationSlot->data.restart_lsn = moveto;
>>
>> SpinLockRelease(&MyReplicationSlot->mutex);
>> retlsn = moveto;
>> +
>> + ReplicationSlotMarkDirty();
>> +
>> + /* We moved retart_lsn, update the global value. */
>> + ReplicationSlotsComputeRequiredLSN();
>> I think that the proposed patch is missing a call to
>> ReplicationSlotsComputeRequiredXmin() here for physical slots.
>
> Hm. It seems ok to include, but I don't think omitting it currently has
> negative effects?

effective_xmin can be used by WAL senders with physical slots. It
seems safer in the long term to include it, IMO.

>> static XLogRecPtr
>> -pg_physical_replication_slot_advance(XLogRecPtr moveto)
>> +pg_physical_replication_slot_advance(XLogRecPtr moveto, bool *advance_done)
>> {
>> XLogRecPtr startlsn = MyReplicationSlot->data.restart_lsn;
>> XLogRecPtr retlsn = startlsn;
>>
>> + *advance_done = false;
>> +
>> if (startlsn < moveto)
>> {
>> SpinLockAcquire(&MyReplicationSlot->mutex);
>> MyReplicationSlot->data.restart_lsn = moveto;
>> SpinLockRelease(&MyReplicationSlot->mutex);
>> retlsn = moveto;
>> + *advance_done = true;
>> }
>>
>> return retlsn;
>
> Hm. Why did you choose not to use endlsn as before (except being
> broken), or something?

When doing repetitive calls of the advancing functions, the advancing
happens in the first call, and the next ones do nothing, so if no
updates is done there is no meaning to flush the slot information.

> It seems quite conceivable somebody is using these functions in an
> extension.

Not sure I get that, pg_physical_replication_slot_advance and
pg_logical_replication_slot_advance are static in slotfuncs.c.

> Hm. I'm think testing this with real LSNs is a better idea. What if the
> node actually already is past FF/FFFFFFFF at this point? Quite unlikely,
> I know, but still. I.e. why not get the current LSN after the INSERT,
> and assert that the slot, after the restart, is that?

Sure. If not disabling autovacuum in the test, we'd just need to make
sure if that advancing is at least ahead of the INSERT position.

Anyway, I am still not sure if we should got down the road to just
mark the slot as dirty if any advancing is done and let the follow-up
checkpoint to the work, if the advancing function should do the slot
flush, or if we choose one and make the other an optional choice (not
for back-branches, obviously. Based on my reading of the code, my
guess is that a flush should happen at the end of the advancing
function.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2020-01-21 05:16:29 Re: Option to dump foreign data in pg_dump
Previous Message Michael Paquier 2020-01-21 04:49:21 Re: pg13 PGDLLIMPORT list