Re: logical decoding and replication of sequences, take 2

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Subject: Re: logical decoding and replication of sequences, take 2
Date: 2024-02-14 16:51:37
Message-ID: 5bbadb71-b51e-45a1-9acc-d54f5e8850b6@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2/13/24 17:37, Robert Haas wrote:
> On Sun, Jan 28, 2024 at 1:07 AM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>> Right, locks + apply in commit order gives us this guarantee (I can't
>> think of a case where it wouldn't be the case).
>
> I couldn't find any cases of inadequate locking other than the one I mentioned.
>
>> Doesn't the whole logical replication critically depend on the commit
>> order? If you decide to arbitrarily reorder/delay the transactions, all
>> kinds of really bad things can happen. That's a generic problem, it
>> applies to all kinds of objects, not just sequences - a parallel apply
>> would need to detect this sort of dependencies (e.g. INSERT + DELETE of
>> the same key), and do something about it.
>
> Yes, but here I'm not just talking about the commit order. I'm talking
> about the order of applying non-transactional operations relative to
> commits.
>
> Consider:
>
> T1: CREATE SEQUENCE s;
> T2: BEGIN;
> T2: SELECT nextval('s');
> T3: SELECT nextval('s');
> T2: ALTER SEQUENCE s INCREMENT 2;
> T2: SELECT nextval('s');
> T2: COMMIT;
>

It's not clear to me if you're talking about nextval() that happens to
generate WAL, or nextval() covered by WAL generated by a previous call.

I'm going to assume it's the former, i.e. nextval() that generated WAL
describing the *next* sequence chunk, because without WAL there's
nothing to apply and therefore no issue with T3 ordering.

The way I think about non-transactional sequence changes is as if they
were tiny transactions that happen "fully" (including commit) at the LSN
where the LSN change is logged.

> The commit order is T1 < T3 < T2, but T3 makes no transactional
> changes, so the commit order is really just T1 < T2. But it's
> completely wrong to say that all we need to do is apply T1 before we
> apply T2. The correct order of application is:
>
> 1. T1.
> 2. T2's first nextval
> 3. T3's nextval
> 4. T2's transactional changes (i.e. the ALTER SEQUENCE INCREMENT and
> the subsequent nextval)
>

Is that quite true? If T3 generated WAL (for the nextval call), it will
be applied at that particular LSN. AFAIK that guarantees it happens
after the first T2 change (which is also non-transactional) and before
the transactional T2 change (because that creates a new relfilenode).

> In other words, the fact that some sequence changes are
> non-transactional creates ordering hazards that don't exist if there
> are no non-transactional changes. So in that way, sequences are
> different from table modifications, where applying the transactions in
> order of commit is all we need to do. Here we need to apply the
> transactions in order of commit and also apply the non-transactional
> changes at the right point in the sequence. Consider the following
> alternative apply sequence:
>
> 1. T1.
> 2. T2's transactional changes (i.e. the ALTER SEQUENCE INCREMENT and
> the subsequent nextval)
> 3. T3's nextval
> 4. T2's first nextval
>
> That's still in commit order. It's also wrong.
>

Yes, this would be wrong. Thankfully the apply is not allowed to reorder
the changes like this, because that's not what "non-transactional" means
in this context.

It does not mean we can arbitrarily reorder the changes, it only means
the changes are applied as if they were independent transactions (but in
the same order as they were executed originally). Both with respect to
the other non-transactional changes, and to "commits" of other stuff.

(for serial apply, at least)

> Imagine that you commit this patch and someone later wants to do
> parallel logical apply. So every time they finish decoding a
> transaction, they stick it in a queue to be applied by the next
> available worker. But, non-transactional changes are very simple, so
> we just directly apply those in the main process. Well, kaboom! But
> now this can happen with the above example.
>
> 1. Decode T1. Add to queue for apply.
> 2. Before the (idle) apply worker has a chance to pull T1 out of the
> queue, decode the first nextval and try to apply it.
>
> Oops. We're trying to apply a modification to a sequence that hasn't
> been created yet. I'm not saying that this kind of hypothetical is a
> reason not to commit the patch. But it seems like we're not on the
> same page about what the ordering requirements are here. I'm just
> making the argument that those non-transactional operations actually
> act like mini-transactions. They need to happen at the right time
> relative to the real transactions. A non-transactional operation needs
> to be applied after any transactions that commit before it is logged,
> and before any transactions that commit after it's logged.
>

How is this issue specific to sequences? AFAIK this is a general problem
with transactions that depend on each other. Consider for example this:

T1: INSERT INTO t (id) VALUES (1);
T2: DELETE FROM t WHERE id = 1;

If you parallelize this in a naive way, maybe T2 gets applied before T1.
In which case the DELETE won't find the row yet.

There's different ways to address this. You can detect this type of
conflicts (e.g. when a DELETE that doesn't find a match), drain the
apply queue and retry the transaction. Or you may compare keysets of the
transactions and make sure the apply waits until the conflicting one
gets fully applied first.

AFAIK for sequences it's not any different, except the key we'd have to
compare is the sequence itself.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2024-02-14 16:52:32 Re: What about Perl autodie?
Previous Message David Christensen 2024-02-14 16:46:02 Re: Constant Splitting/Refactoring