Re: logical decoding and replication of sequences

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>
Subject: Re: logical decoding and replication of sequences
Date: 2021-07-30 18:26:50
Message-ID: 3d6df331-5532-6848-eb45-344b265e0238@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Here's a an updated version of this patch - rebased to current master,
and fixing some of the issues raised in Peter's review.

Mainly, this adds a TAP test to src/test/subscription, focusing on
testing the various situations with transactional and non-transactional
behavior (with subtransactions and various ROLLBACK versions).

This new TAP test however uncovered an issue with wait_for_catchup(),
because that uses pg_current_wal_lsn() to wait for replication of all
the changes. But that does not work when the sequence gets advanced in a
transaction that is then aborted, e.g. like this:

BEGIN;
SELECT nextval('s') FROM generate_series(1,100);
ROLLBACK;

The root cause is that pg_current_wal_lsn() uses the LogwrtResult.Write,
which is updated by XLogFlush() - but only in RecordTransactionCommit.
Which makes sense, because only the committed stuff is "visible". But
the non-transactional behavior changes this, because now some of the
changes from aborted transactions may need to be replicated. Which means
the wait_for_catchup() ends up not waiting for the sequence change.

One option would be adding XLogFlush() to RecordTransactionAbort(), but
my guess is we don't do that intentionally (even though aborts should be
fairly rare in most workloads).

I'm not entirely sure changing this (replicating changes from aborted
xacts) is a good idea. Maybe it'd be better to replicate this "lazily" -
instead of replicating the advances right away, we might remember which
sequences were advanced in the transaction, and then replicate current
state for those sequences at commit time.

The idea is that if an increment is "invisible" we probably don't need
to replicate it, it's fine to replicate the next "visible" increment. So
for example given

BEGIN;
SELECT nextval('s');
ROLLBACK;

BEGIN;
SELECT nextval('s');
COMMIT;

we don't need to replicate the first change, but we need to replicate
the second one.

The trouble is we don't decode individual sequence advances, just those
that update the sequence tuple (so every 32 values or so). So we'd need
to remeber the first increment, in a way that is (a) persistent across
restarts and (b) shared by all backends.

The other challenge seems to be ordering of the changes - at the moment
we have no issues with this, because increments on the same sequence are
replicated immediately, in the WAL order. But postponing them to commit
time would affect this order.

I've also briefly looked at the code duplication - there's a couple
functions in the patch that I shamelessly copy-pasted and tweaked to
handle sequences instead of tables:

publicationcmds.c
-----------------

1) OpenTableList/CloseTableList - > OpenSequenceList/CloseSequenceList

Trivial differences, trivial to get rid of - the only difference is
pretty much just table_open vs. relation open.

2) AlterPublicationTables -> AlterPublicationSequences

This is a bit more complicated, because the tables also handle
inheritance (which I think does not apply to sequences). Other than
that, it's calling the functions from (1).

subscriptioncmds.c
------------------

1) fetch_table_list, fetch_sequence_list

Minimal differences, essentially just the catalog name.

2) AlterSubscription_refresh

A lot of duplication, but the code handling tables and sequences is
almost exactly the same and can be reused fairly easily (moved to a
separate function, called for tables and then sequences).

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0001-Logical-decoding-replication-of-sequences-20210729.patch text/x-patch 117.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2021-07-30 18:33:15 Re: archive status ".ready" files may be created too early
Previous Message Tom Lane 2021-07-30 17:58:51 Re: [PATCH] Hooks at XactCommand level