Re: logical decoding and replication of sequences, take 2

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Subject: Re: logical decoding and replication of sequences, take 2
Date: 2024-02-20 12:09:25
Message-ID: aeec3032-03d2-43b8-87b9-da98785bc2f9@enterprisedb.com
Lists: pgsql-hackers

On 2/20/24 06:54, Amit Kapila wrote:
> On Thu, Dec 21, 2023 at 6:47 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> On 12/19/23 13:54, Christophe Pettus wrote:
>>> Hi,
>>>
>>> I wanted to hop in here on one particular issue:
>>>
>>>> On Dec 12, 2023, at 02:01, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>>> - desirability of the feature: Random IDs (UUIDs etc.) are likely a much
>>>> better solution for distributed (esp. active-active) systems. But there
>>>> are important use cases that are likely to keep using regular sequences
>>>> (online upgrades of single-node instances, existing systems, ...).
>>>
>>> +1.
>>>
>>> Right now, the lack of sequence replication is a rather large
>>> foot-gun on logical replication upgrades. Copying the sequences
>>> over during the cutover period is doable, of course, but:
>>>
>>> (a) There's no out-of-the-box tooling that does it, so everyone has
>>> to write some scripts just for that one function.
>>>
>>> (b) It's one more thing that extends the cutover window.
>>>
>>
>> I agree it's an annoying gap for this use case. But if this is the only
>> use case, maybe a better solution would be to provide such tooling
>> instead of adding it to the logical decoding?
>>
>> It might seem a bit strange if most data is copied by replication
>> directly, while sequences need special handling, ofc.
>>
>
> One difference between the logical replication of tables and sequences
> is that we can guarantee with synchronous_commit (and
> synchronous_standby_names) that after failover transactions data is
> replicated or not whereas for sequences we can't guarantee that
> because of their non-transactional nature. Say, there are two
> transactions T1 and T2, it is possible that T1's entire table data and
> sequence data are committed and replicated but only T2's sequence data
> is replicated. So, after failover to logical subscriber in such a case if
> one routes T2 again to the new node as it was not successful
> previously then it would needlessly perform the sequence changes
> again. I don't know how much that matters but that would probably be the
> difference between the replication of tables and sequences.
>

I don't quite follow what the problem with synchronous_commit is :-(

For sequences, we log the changes ahead, i.e. even if nextval() did not
write anything into WAL, it's still safe because these changes are
covered by the WAL generated some time ago (up to ~32 values back). And
that's certainly subject to synchronous_commit, right?
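
To illustrate the "logged ahead" behavior, here is a toy model in Python. It is only a sketch, not the actual sequence.c code: the class and method names are made up for illustration, and the only detail taken from PostgreSQL is the SEQ_LOG_VALS constant (32). It shows that most nextval() calls write no WAL, yet every value handed out is already covered by an earlier record.

```python
SEQ_LOG_VALS = 32  # same value as the constant in PostgreSQL's sequence.c


class ToySequence:
    """Hypothetical model of a sequence that WAL-logs values ahead of use."""

    def __init__(self):
        self.last_value = 0  # last value returned by nextval()
        self.log_cnt = 0     # values still covered by the last WAL record
        self.wal = []        # toy WAL: each entry is the logged "future" value

    def nextval(self):
        if self.log_cnt == 0:
            # Nothing pre-logged is left: write one record that covers
            # this value plus SEQ_LOG_VALS further values.
            self.wal.append(self.last_value + 1 + SEQ_LOG_VALS)
            self.log_cnt = SEQ_LOG_VALS
        else:
            # Covered by the earlier record, so no WAL is written here.
            self.log_cnt -= 1
        self.last_value += 1
        return self.last_value

    def value_after_crash(self):
        # Recovery would restore the sequence to the last logged value,
        # which is never behind a value that was already handed out.
        return self.wal[-1]


seq = ToySequence()
for _ in range(33):
    seq.nextval()
print(len(seq.wal), seq.value_after_crash())
```

In this model the first 33 calls produce a single WAL record, and the 34th call writes the next one. Since that one record has to be flushed like any other WAL, it is subject to synchronous_commit in the same way.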

There certainly are issues with sequences and syncrep:

https://www.postgresql.org/message-id/712cad46-a9c8-1389-aef8-faf0203c9be9@enterprisedb.com

but that's unrelated to logical replication.

FWIW I don't think we'd re-apply sequence changes needlessly, because
the worker does update the origin after applying non-transactional
changes. So after the replication gets restarted, we'd skip what we
already applied, no?
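
A rough sketch of the reasoning, again as a toy model in Python. Everything here (ToySubscriber, apply_sequence_change, the LSN values) is hypothetical and not the apply worker's real code. It also simplifies by filtering on the subscriber side, whereas in practice the origin position is used to decide where the stream restarts. The point is only that once the origin advances past a non-transactional change, replaying the stream does not apply it a second time.

```python
class ToySubscriber:
    """Hypothetical apply worker that tracks replication origin progress."""

    def __init__(self):
        self.sequences = {}  # sequence name -> last replicated value
        self.origin_lsn = 0  # progress recorded for the replication origin
        self.applied = 0     # number of changes actually applied

    def apply_sequence_change(self, lsn, name, value):
        if lsn <= self.origin_lsn:
            # Already applied before the restart, so skip it.
            return False
        self.sequences[name] = value
        # Advance the origin right after the non-transactional change.
        self.origin_lsn = lsn
        self.applied += 1
        return True


sub = ToySubscriber()
stream = [(10, "s", 33), (20, "s", 66)]

for change in stream:
    sub.apply_sequence_change(*change)

# Simulate a replication restart that replays the same stream.
for change in stream:
    sub.apply_sequence_change(*change)

print(sub.applied, sub.sequences)
```

If the origin were not advanced after the sequence change, the replay loop would apply both changes twice, which is the needless re-apply being asked about.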

But maybe there is an issue and I'm just not getting it. Could you maybe
share an example of T1/T2, with a replication restart and what you think
would happen?

> I agree with your point above that for upgrades some tool like
> pg_copysequence where we can provide a way to copy sequence data to
> subscribers from the publisher would suffice.
>

Perhaps. Unfortunately it doesn't quite work for failovers, and it's yet
another tool users would need to use.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
