Re: logical decoding and replication of sequences, take 2

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Subject: Re: logical decoding and replication of sequences, take 2
Date: 2024-01-23 20:47:24
Message-ID: CA+TgmoY9tCqc_Tg4Bp8zvxvzzzDJbnopvhyVU7CwWnO21Y=Z2Q@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 11, 2024 at 11:27 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> 1) desirability: We want a built-in way to handle sequences in logical
> replication. I think everyone agrees this is not a way to do distributed
> sequences in active-active setups, but that there are other use cases
> that need this feature - typically upgrades / logical failover.

Yeah. I find it extremely hard to take seriously the idea that this
isn't a valuable feature. How else are you supposed to do a logical
failover without having your entire application break?

> 2) performance: There was concern about the performance impact, and that
> it affects everyone, including those who don't replicate sequences (as
> the overhead is mostly incurred before calls to output plugin etc.).
>
> The agreement was that the best way is to have a CREATE SUBSCRIPTION
> option that would instruct the upstream to decode sequences. By default
> this option is 'off' (because that's the no-overhead case), but it can
> be enabled for each subscription.

Seems reasonable, at least unless and until we come up with something better.

> 3) correctness: The last point is about making "transactional" flag
> correct when the snapshot state changes mid-transaction, originally
> pointed out by Dilip [4]. Per [5] this however happens to work
> correctly, because while we identify the change as 'non-transactional'
> (which is incorrect), we immediately throw it away (so we don't try to
> apply it, which would error-out).

I've said this before, but I still find this really scary. It's
unclear to me that we can simply classify updates as transactional or
non-transactional and expect things to work. If it's possible, I hope
we have a really good explanation somewhere of how and why it's
possible. If we do, can somebody point me to it so I can read it?

To be slightly clearer about my concern, I think the scary
case is where we have transactional and non-transactional things
happening to the same sequence in close temporal proximity, either
within the same session or across two or more sessions. If a
non-transactional change can get reordered ahead of some transactional
change upon which it logically depends, or behind some transactional
change that logically depends on it, then we have trouble. I also
wonder if there are any cases where the same operation is partly
transactional and partly non-transactional.
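
A toy sketch of the reordering hazard described above, purely for
illustration. It is not the patch's code or PostgreSQL's decoding logic:
the event kinds ("txn", "nontxn", "commit"), the replay() function, and the
sequence values are all made up. It assumes only that transactional changes
are buffered until commit while non-transactional changes are applied as
soon as they are decoded.

```python
from collections import defaultdict

def replay(events):
    """Replay a toy change stream given in WAL order.

    Each event is (xid, kind, value). "txn" changes are buffered until
    their transaction commits; "nontxn" changes are applied immediately.
    Returns the order in which a subscriber would see the changes.
    """
    pending = defaultdict(list)   # xid -> buffered transactional changes
    applied = []                  # apply order on the subscriber
    for xid, kind, value in events:
        if kind == "txn":
            pending[xid].append(value)
        elif kind == "nontxn":
            applied.append(value)
        elif kind == "commit":
            applied.extend(pending.pop(xid, []))
    return applied

# Hypothetical stream: the second change logically depends on the first,
# but has been classified as non-transactional.
wal = [
    (1, "txn", "restart seq at 1000"),
    (1, "nontxn", "advance seq to 1032"),
    (1, "commit", None),
]

print(replay(wal))
# The advance is applied before the restart it depends on.
```

In this model the dependent change jumps ahead of the change it depends on,
which is the kind of outcome a transactional/non-transactional
classification would have to rule out.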

--
Robert Haas
EDB: http://www.enterprisedb.com
