Re: logical decoding and replication of sequences

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: logical decoding and replication of sequences
Date: 2022-02-19 02:18:48
Message-ID: e890448d-00b5-18d0-e894-85d17660d484@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2/15/22 10:00, Peter Eisentraut wrote:
> On 13.02.22 14:10, Tomas Vondra wrote:
>> Here's the remaining part, rebased, with a small tweak in the TAP test
>> to eliminate the issue with not waiting for sequence increments. I've
>> kept the tweak in a separate patch, so that we can throw it away easily
>> if we happen to resolve the issue.
>
> This basically looks fine to me.  You have identified a few XXX and
> FIXME spots that should be addressed.
>
> Here are a few more comments:
>
> * general
>
> Handling of owned sequences, as previously discussed.  This would
> probably be a localized change somewhere in get_rel_sync_entry(), so it
> doesn't affect the overall structure of the patch.
>

So you're suggesting not to track owned sequences in pg_publication_rel
explicitly, and handle them dynamically in output plugin? So when
calling get_rel_sync_entry on the sequence, we'd check if it's owned by
a table that is replicated.

We'd want a way to enable/disable this for each publication, but that
makes the lookups more complicated. Also, we'd probably need the same
logic elsewhere (e.g. in psql, when listing sequences in a publication).

I'm not sure we want this complexity, maybe we should simply deal with
this in the ALTER PUBLICATION and track all sequences (owned or not) in
pg_publication_rel.

> pg_dump support is missing.
>

Will fix.

> Some psql \dxxx support should probably be there.  Check where existing
> publication-table relationships are displayed.
>

Yeah, I noticed this while adding regression tests. Currently, \dRp+
shows something like this:

Publication testpub_fortbl
Owner | All tables | Inserts | Updates ...
--------------------------+------------+---------+--------- ...
regress_publication_user | f | t | t ...
Tables:
"pub_test.testpub_nopk"
"public.testpub_tbl1"

or

Publication testpub5_forschema
Owner | All tables | Inserts | Updates | ...
--------------------------+------------+---------+---------+- ...
regress_publication_user | f | t | t | ...
Tables from schemas:
"CURRENT_SCHEMA"
"public"

I wonder if we should copy the same block for sequences, so

Publication testpub_fortbl
Owner | All tables | Inserts | Updates ...
--------------------------+------------+---------+--------- ...
regress_publication_user | f | t | t ...
Tables:
"pub_test.testpub_nopk"
"public.testpub_tbl1"
Sequences:
"pub_test.sequence_s1"
"public.sequence_s2"

And same for "tables from schemas" etc.

> * src/backend/catalog/system_views.sql
>
> +         LATERAL pg_get_publication_sequences(P.pubname) GPT,
>
> The GPT presumably stood for "get publication tables", so should be
> changed.
>
> (There might be a few more copy-and-paste occasions like this around.  I
> have not written down all of them.)
>

Will fix.

> * src/backend/commands/publicationcmds.c
>
> This adds a new publication option called "sequence".  I would have
> expected it to be named "sequences", but that's just my opinion.
>
> But in any case, the option is not documented at all.
>
> Some of the new functions added in this file are almost exact duplicates
> of the analogous functions for tables.  For example,
> PublicationAddSequences()/PublicationDropSequences() are almost
> exactly the same as PublicationAddTables()/PublicationDropTables(). This
> could be handled with less duplication by just adding an ObjectType
> argument to the existing functions.
>

Yes, I noticed that too, and I'll review this later, after making sure
the behavior is correct.

> * src/backend/commands/sequence.c
>
> Could use some refactoring of ResetSequence()/ResetSequence2().  Maybe
> call the latter ResetSequenceData() and have the former call it internally.
>

Will check.

> * src/backend/commands/subscriptioncmds.c
>
> Also lots of duplication between tables and sequences in this file.
>

Same as the case above.

> * src/backend/replication/logical/tablesync.c
>
> The comment says it needs sequence support, but there appears to be
> support for initial sequence syncing.  What exactly is missing here?
>

I think that's just obsolete comment.

> * src/test/subscription/t/028_sequences.pl
>
> Change to use done_testing() (see
> 549ec201d6132b7c7ee11ee90a4e02119259ba5b).

Will fix.

Do we need to handle both FOR ALL TABLES and FOR ALL SEQUENCES at the
same time? That seems like a reasonable thing people might want.

The patch probably also needs to modify pg_publication_namespace to
track whether the schema is FOR TABLES IN SCHEMA or FOR SEQUENCES.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2022-02-19 02:41:36 Re: Timeout control within tests
Previous Message Mark Wong 2022-02-19 02:00:28 Re: Time to drop plpython2?