Re: logical decoding and replication of sequences

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: logical decoding and replication of sequences
Date: 2022-02-22 23:24:30
Message-ID: 9defb937-20a9-1d9f-6972-f1c4b4da4f73@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Here's an updated version of the patch, fixing most of the issues from
reviews so far. There's still a couple FIXME comments, but I think those
are minor and/or straightforward to deal with.

The main improvements:

1) Elimination of a lot of redundant code - one function handling
tables, and an almost exact copy handling sequences. Now a single
function handles both, possibly with "sequences" flag to tweak the behavior.

2) A couple other functions gained "sequences" flag, determining which
objects are "interesting". For example PublicationAddSchemas needs to
know whether it's FOR ALL SEQUENCES or FOR ALL TABLES IN SCHEMA. I don't
think we can just use relkind here easily, because for tables we need to
handle various types of tables (regular, partitioned, ...).

3) I also renamed a couple functions with "tables" in the name, which
are now used for sequences too. So for example OpenTablesList() is
renamed to OpenRelationList() and so on.

4) Addition of a number of regression tests to "publication.sql", which
showed a lot of issues, mostly related to not distinguishing tables and
sequences when handling "FOR ALL TABLES [IN SCHEMA]" and "FOR ALL
SEQUENCES [IN SCHEMA]".

5) Proper tracking of FOR ALL [TABLES|SEQUENCES] IN SCHEMA in a catalog.
The pg_publication_namespace gained a pnsequences flag, which determines
which case it is. So for example if you do

ALTER PUBLICATION p ADD ALL TABLES IN SCHEMA s;
ALTER PUBLICATION p ADD ALL SEQUENCES IN SCHEMA s;

there will be two rows in the catalog, one with 't' and one with 'f' in
the new column. I'm not sure this is the best way to track this - maybe
it'd be better to have two flags, and keep a single row. Or maybe we
should have an array of relkinds (but that has the issue with tables
having multiple relkinds mentioned before). Multiple rows make it more
convenient to add/remove publication schemas - with a single row it'd be
necessary to either insert a new row or update an existing one when
adding the schema, and similarly for dropping it.

But maybe there are reasons / precedent to design this differently ...

6) I'm not entirely sure the object_address changes (handling of the
pnsequences flag) are correct.

7) This updates psql to do roughly the same thing as for tables, so \dRp
now list sequences added either directly or through schema, so you might
get footer like this:

\dRp+ testpub_mix
...
Tables:
"public.testpub_tbl1"
Tables from schemas:
"pub_test"
Sequences:
"public.testpub_seq1"
Sequences from schemas:
"pub_test"

Maybe it's a bit too verbose, though. It also addes "All sequences" and
"Sequences" columns into the publication description, but I don't think
that can be done much differently.

FWIW I had to switch the describeOneTableDetails() chunk dealing with
sequences from printQuery() to printTable() in order to handle dynamic
footers.

There's also a change in \dn+ because a schema may be included in one
publication as "FOR ALL SEQUENCES IN SCHEMA" and in another publication
with "FOR ALL TABLES IN SCHEMA". So I modified the output to

\dn+ pub_test1
...
Publications:
"testpub_schemas" (sequences)
"testpub_schemas" (tables)

But maybe it'd be better to aggregate this into a single line like

\dn+ pub_test1
...
Publications:
"testpub_schemas" (tables, sequences)

Opinions?

8) There's a shortcoming in the current grammar, because you can specify
either

CREATE PUBLICATION p FOR ALL TABLES;

or

CREATE PUBLICATION p FOR ALL SEQUENCES;

but it's not possible to do

CREATE PUBLICATION p FOR ALL TABLES AND FOR ALL SEQUENCES;

which seems like a fairly reasonable thing users might want to do.

The problem is that "FOR ALL TABLES" (and same for sequences) is
hard-coded in the grammar, not defined as PublicationObjSpec. This also
means you can't do

ALTER PUBLICATION p ADD ALL TABLES;

AFAICS there are two ways to fix this - adding the combinations into the
definition of CreatePublicationStmt, or adding FOR ALL TABLES (and
sequences) to PublicationObjSpec.

9) Another grammar limitation is that we don't cross-check the relkind,
so for example

ALTER PUBLICATION p ADD TABLE sequence;

might actually work. Should be easy to fix, though.

10) Added pg_dump support (including tests). I'll add more tests, to
check more grammar combinations.

11) I need to test more grammar combinations in the TAP test too, to
verify the output plugin interprets the stuff correctly.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0001-Add-support-for-decoding-sequences-to-built-20220222.patch text/x-patch 220.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2022-02-22 23:39:00 Re: logical decoding and replication of sequences
Previous Message Tom Lane 2022-02-22 23:23:19 Re: Frontend error logging style