Re: logical decoding and replication of sequences, take 2

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Subject: Re: logical decoding and replication of sequences, take 2
Date: 2023-07-25 11:59:42
Message-ID: 21c87ea8-86c9-80d6-bc78-9b95033ca00b@enterprisedb.com
Lists: pgsql-hackers

On 7/25/23 08:28, Amit Kapila wrote:
> On Mon, Jul 24, 2023 at 9:32 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> On 7/24/23 12:40, Amit Kapila wrote:
>>> On Wed, Jul 5, 2023 at 8:21 PM Ashutosh Bapat
>>> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>>>
>>> Even after that, see below the value of the sequence is still not
>>> caught up. Later, when the apply worker processes all the WAL, the
>>> sequence state will be caught up.
>>>
>>
>> And how is this different from what tablesync does for tables? For that
>> 'r' also does not mean it's fully caught up, IIRC. What matters is
>> whether the sequence since this moment can go back. And I don't think it
>> can, because that would require replaying changes from before we did
>> copy_sequence ...
>>
>
> For sequences, it is quite possible that we replay WAL from before the
> copy_sequence whereas the same is not true for tables (w.r.t
> copy_table()). This is because for tables we have a kind of interlock
> w.r.t LSN returned via create_slot (say this value of LSN is LSN1),
> basically, the walsender corresponding to tablesync worker in
> publisher won't send any WAL before that LSN whereas the same is not
> true for sequences. Also, even if apply worker can receive WAL before
> copy_table, it won't apply that as that would be behind the LSN1 and
> the same is not true for sequences. So, for tables, we will never go
> back to a state before the copy_table() but for sequences, we can go
> back to a state before copy_sequence().
>

Right. I think the key detail is that during sync we have three
important LSNs:

- LSN1 where the slot is created
- LSN2 where the copy happens
- LSN3 where we consider the sync completed

For tables, LSN1 == LSN2, because the data is copied using the
snapshot from the temporary slot. And (LSN1 <= LSN3).

But for sequences, the copy happens after the slot creation, possibly
with (LSN1 < LSN2). And because LSN3 comes from the main subscription
(which may be a bit behind, for whatever reason), it may happen that

(LSN1 < LSN3 < LSN2)

Then the sync ends at LSN3, but that means all sequence changes between
LSN3 and LSN2 will be applied "again", making the sequence go backwards.

IMHO the right fix is to make sure LSN3 >= LSN2 (for sequences).
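
To make the ordering argument concrete, here is a toy model in Python.
It is only a sketch, not PostgreSQL code: the WAL list, the LSN numbers,
and the function names are all made up for illustration, and it assumes
the apply worker simply overwrites the sequence state with every
sequence record it sees after LSN3.

```python
# Toy WAL: (lsn, sequence_value) records; the sequence only ever increases.
# All numbers here are hypothetical.
WAL = [(10, 100), (20, 200), (30, 300), (40, 400)]


def subscriber_states(wal, lsn2, lsn3):
    """Sequence values seen on the subscriber: first the value copied at
    LSN2, then every record the apply worker replays after LSN3."""
    # The copy at LSN2 sees the latest value written at or before LSN2.
    copied = max(value for lsn, value in wal if lsn <= lsn2)
    states = [copied]
    for lsn, value in wal:
        if lsn > lsn3:
            states.append(value)  # apply overwrites the sequence state
    return states


def goes_backwards(states):
    """True if any step moves the sequence to a smaller value."""
    return any(b < a for a, b in zip(states, states[1:]))


LSN2 = 35  # where the copy happens
LSN3 = 15  # where the sync is considered complete (LSN3 < LSN2)

# Problem case: records between LSN3 and LSN2 are applied "again".
broken = subscriber_states(WAL, LSN2, LSN3)

# Proposed fix: make sure LSN3 >= LSN2 before declaring the sync done.
fixed = subscriber_states(WAL, LSN2, max(LSN3, LSN2))

print(broken, goes_backwards(broken))
print(fixed, goes_backwards(fixed))
```

With LSN3 < LSN2 the subscriber first sees the copied value and then the
older records at LSN 20 and 30, so the sequence moves backwards before it
catches up. With LSN3 forced to be at least LSN2, only records after the
copy are replayed.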

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
