RE: Synchronizing slots from primary to standby

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: "'Drouvot, Bertrand'" <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Subject: RE: Synchronizing slots from primary to standby
Date: 2023-06-29 10:22:06
Message-ID: TYAPR01MB58660162840428B320AE4087F525A@TYAPR01MB5866.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear Drouvot,

Hi, I'm also interested in this feature. The following are my high-level comments.
I have not commented on details because the patch is not at that stage yet.
I'm also sorry that I could not follow the whole discussion.

1. I think we should not reuse the logical replication launcher for another purpose.
A background worker should have only one task. I would like to hear other people's opinions on this...
2. I want to confirm why a new replication command is added. IIUC the
launcher connects to the primary by using the primary_conninfo connection string, but
that establishes a physical replication connection, so no SQL can be executed.
Is that right? Another approach that avoids the new command would be to specify the
target database via a GUC, although that is not smart. What do you think?
3. You chose the per-db worker approach; however, that makes it difficult to extend the
feature to support physical slots. This may be problematic. Was there any
reason for that choice? I suspected that ReplicationSlotCreate() or the advance functions
might not be usable from other databases and that this was the reason, but I'm not sure.
If these operations can be done without connecting to a specific database, I think
the architecture can be changed.
4. Currently the launcher establishes a new connection every time. Isn't it better
to reuse the same one instead?

The following comments assume this configuration, which is probably the most straightforward:

primary->standby
   |->subscriber

5. After constructing the system, I dropped the subscription on the subscriber.
In this case the logical slot on the primary was removed, but the removal was not
replicated to the standby server. Is this workload supported or not?

```
$ psql -U postgres -p $port_sub -c "DROP SUBSCRIPTION sub"
NOTICE: dropped replication slot "sub" on publisher
DROP SUBSCRIPTION

$ psql -U postgres -p $port_primary -c "SELECT * FROM pg_replication_slots"
slot_name | plugin | slot_type | datoid | database |...
-----------+----------+-----------+--------+----------+...
(0 rows)

$ psql -U postgres -p $port_standby -c "SELECT * FROM pg_replication_slots"
slot_name | plugin | slot_type | datoid | database |...
-----------+----------+-----------+--------+----------+...
sub | pgoutput | logical | 5 | postgres |...
(1 row)

```
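
To state the expected result more precisely, here is a minimal sketch of the check I have in mind. It is not code from the patch; the function name and the plain lists of slot names are only assumptions for illustration. It reports slots that remain on the standby although they no longer exist on the primary:

```python
def stale_standby_slots(primary_slots, standby_slots):
    """Return names of slots that exist on the standby but not on the primary.

    Both arguments are iterables of slot names, e.g. the slot_name column
    of pg_replication_slots fetched from each node (hypothetical input).
    """
    return sorted(set(standby_slots) - set(primary_slots))


# Values taken from the output above, after DROP SUBSCRIPTION.
primary = []        # no slots remain on the primary
standby = ["sub"]   # the slot is still present on the standby

print(stale_standby_slots(primary, standby))  # ['sub']
```

In the case above the result is not empty, so I would expect the synchronization to drop "sub" on the standby as well.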

6. The current approach may delay the start point of the sync.

Assume that the physical replication system is created first, and then the
subscriber connects to the publisher node. In this case the launcher connects to the
primary earlier than the apply worker and reads the slots. At that time there are
no slots on the primary, so the launcher disconnects from the primary and waits for a period (up to 3 min).
Even if the apply worker then creates the slot on the publisher, the launcher on the standby
cannot notice it. The synchronization may start up to 3 min later.
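
To make the timing concrete, the following is a rough model, not the patch's logic. It assumes that the launcher polls the primary at a fixed interval of 180 seconds starting at t = 0, and that the slot is created at some later time; the real nap time may be shorter. The function names are hypothetical.

```python
import math


def sync_start_time(slot_created_at, nap_seconds=180.0):
    """Time of the first poll that can see the slot.

    Assumes polls happen at t = 0, nap, 2 * nap, ... (fixed nap time).
    """
    if slot_created_at <= 0:
        return 0.0
    return math.ceil(slot_created_at / nap_seconds) * nap_seconds


def sync_delay(slot_created_at, nap_seconds=180.0):
    """Seconds between slot creation and the poll that notices it."""
    return sync_start_time(slot_created_at, nap_seconds) - slot_created_at


# Slot created 1 second after the launcher found no slots.
print(sync_delay(1.0))    # 179.0
# Slot created just before the next poll.
print(sync_delay(179.0))  # 1.0
```

Under these assumptions the delay is close to the full nap time when the slot is created right after an empty poll, which is the case described above.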

I'm not sure how to fix this, or whether it is acceptable. Thoughts?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
