Re: PG Docs for ALTER SUBSCRIPTION REFRESH PUBLICATION - copy_data option

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: PG Docs for ALTER SUBSCRIPTION REFRESH PUBLICATION - copy_data option
Date: 2021-09-15 03:19:38
Message-ID: CAHut+PvJMRB-ZyC80we2kiUFv4cVjmA6jxXpEMhm1rmz=1ryeA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 14, 2021 at 8:33 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Jun 25, 2021 at 9:20 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > But I recently learned that when there are partitions in the
> > publication, then toggling the value of the PUBLICATION option
> > "publish_via_partition_root" [3] can also *implicitly* change the list
> > published tables, and therefore that too might cause any ASRP to make
> > use of the copy_data value for those implicitly added
> > partitions/tables.
> >
>
> I have tried the below example in this context but didn't see any
> effect on changing via_root option.

Thanks for trying to reproduce. I also thought your steps were the
same as what I'd previously done but it seems like it was a bit
different. Below are my steps to observe some unexpected COPY
happening. Actually, now I am no longer sure if this is just a
documentation issue; perhaps it is a bug.

STEP 1 - create partition tables on both sides
===================================

[PUB and SUB]

postgres=# create table troot (a int) partition by range(a);
CREATE TABLE
postgres=# create table tless10 partition of troot for values from (1) to (9);
CREATE TABLE
postgres=# create table tmore10 partition of troot for values from (10) to (99);
CREATE TABLE

STEP 2 - insert some data on pub-side
==============================

[PUB]

postgres=# insert into troot values (1),(2),(3);
INSERT 0 3
postgres=# insert into troot values (11),(12),(13);
INSERT 0 3

postgres=# select * from troot;
a
----
1
2
3
11
12
13
(6 rows)

STEP 3 - create a publication on the partition root
======================================

[PUB]

postgres=# CREATE PUBLICATION pub1 FOR TABLE troot;
CREATE PUBLICATION
postgres=# \dRp+ pub1;
Publication pub1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+----------
postgres | f | t | t | t | t | f
Tables:
"public.troot"

STEP 4 - create the subscriber
=======================

[SUB]

postgres=# CREATE SUBSCRIPTION sub1 CONNECTION 'host=127.0.0.1
port=5432 dbname=postgres' PUBLICATION pub1;
NOTICE: created replication slot "sub1" on publisher
CREATE SUBSCRIPTION
postgres=# 2021-09-15 12:45:12.224 AEST [30592] LOG: logical
replication apply worker for subscription "sub1" has started
2021-09-15 12:45:12.236 AEST [30595] LOG: logical replication table
synchronization worker for subscription "sub1", table "tless10" has
started
2021-09-15 12:45:12.247 AEST [30598] LOG: logical replication table
synchronization worker for subscription "sub1", table "tmore10" has
started
2021-09-15 12:45:12.326 AEST [30595] LOG: logical replication table
synchronization worker for subscription "sub1", table "tless10" has
finished
2021-09-15 12:45:12.332 AEST [30598] LOG: logical replication table
synchronization worker for subscription "sub1", table "tmore10" has
finished

postgres=# select * from troot;
a
----
1
2
3
11
12
13
(6 rows)

// To this point, everything looks OK...

STEP 5 - toggle the publish_via_partition_root flag
======================================

[PUB]

postgres=# alter publication pub1 set (publish_via_partition_root = true);
ALTER PUBLICATION
postgres=# \dRp+ pub1;
Publication pub1
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+----------
postgres | f | t | t | t | t | t
Tables:
"public.troot"

// And then refresh the subscriber

[SUB]

postgres=# alter subscription sub1 refresh PUBLICATION;
ALTER SUBSCRIPTION
postgres=# 2021-09-15 12:48:37.927 AEST [3861] LOG: logical
replication table synchronization worker for subscription "sub1",
table "troot" has started
2021-09-15 12:48:37.977 AEST [3861] LOG: logical replication table
synchronization worker for subscription "sub1", table "troot" has
finished

// Notice above that another tablesync worker has launched and copied
everything again - BUG??

[SUB]

postgres=# select * from troot;
a
----
1
2
3
1
2
3
11
12
13
11
12
13
(12 rows)

// At this point if I would keep toggling the
publish_via_partition_root then each time I do subscription REFRESH
PUBLICATION it will copy the data yet again. For example,

[PUB]

postgres=# alter publication pub1 set (publish_via_partition_root = false);
ALTER PUBLICATION

[SUB]

postgres=# alter subscription sub1 refresh PUBLICATION;
ALTER SUBSCRIPTION
postgres=# 2021-09-15 12:59:02.106 AEST [21709] LOG: logical
replication table synchronization worker for subscription "sub1",
table "tless10" has started
2021-09-15 12:59:02.120 AEST [21711] LOG: logical replication table
synchronization worker for subscription "sub1", table "tmore10" has
started
2021-09-15 12:59:02.189 AEST [21709] LOG: logical replication table
synchronization worker for subscription "sub1", table "tless10" has
finished
2021-09-15 12:59:02.207 AEST [21711] LOG: logical replication table
synchronization worker for subscription "sub1", table "tmore10" has
finished

By now the pub/sub data on each side is quite different
==========================================

[PUB]

postgres=# select count(*) from troot;
count
-------
6
(1 row)

[SUB]

postgres=# select count(*) from troot;
count
-------
18
(1 row)

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-09-15 03:34:21 Re: Column Filtering in Logical Replication
Previous Message Michael Paquier 2021-09-15 03:05:21 Re: Estimating HugePages Requirements?