Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

From: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date: 2023-05-08 15:41:26
Message-ID: CAGPVpCRzD-ZZEc9ienhyrVpCzd9AJ7fxE--OFFJBnBg3E0438w@mail.gmail.com
Lists: pgsql-hackers

Hi Melanie,

Thanks for reviewing.

On Mon, 27 Feb 2023 at 03:10, Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> I notice that you've added a new user-facing option to make a snapshot.
> I think functionality to independently make a snapshot for use elsewhere
> has been discussed in the past for the implementation of different
> features (e.g. [1] pg_dump but they ended up using replication slots for
> this I think?), but I'm not quite sure I understand all the implications
> for providing a user-visible create snapshot command. Where can it be
> used? When can the snapshot be used? In your patch's case, you know that
> you can use the snapshot you are creating, but I just wonder if any
> restrictions or caveats need be taken for its general use.

I can't think of a use case, other than this patch, that needs this
user-facing command. The main reason I added the command in this form is
that this is already how other required communication between publisher
and subscriber is done for other operations in logical replication. Even
though it may sound similar to the pg_dump discussion, I think the main
difference is that calling CREATE_REPLICATION_SNAPSHOT creates a snapshot
and imports it into the transaction that invoked
CREATE_REPLICATION_SNAPSHOT; it is not used anywhere else.
But I agree that this part of the patch needs more thought and review.
Honestly, I'm also not sure whether this is the ideal way to fix the
"snapshot issue" introduced by reusing the same replication slot.

>
> For the worker reuse portion of the code, could it be a separate patch
> in the set? It could be independently committable and would be easier to
> review (separate from repl slot reuse).

I did this; please see patch 0001.

>
> You mainly have this structure, but it is a bit hidden and some of the
> shared functions that previously may have made sense for table sync
> worker and apply workers to share don't really make sense to share
> anymore.
>
> The main thing that table sync workers and apply workers share is the
> logic in LogicalRepApplyLoop() (table sync workers use when they do
> catchup), so perhaps we should make the other code separate?

You're right that the apply and tablesync workers' paths are unnecessarily
intertwined, and with the worker/replication slot reuse logic I guess it
became worse.
I tried to change the structure to something similar to what you described.
Tablesync workers now have a different starting point and simply run as
follows:

TableSyncWorkerMain()
    loop:
        start_table_sync()
        walrcv_startstreaming()
        LogicalRepApplyLoop()
        check if there is a table with INIT state
        if there is such a table:           // reuse case
            clean_sync_worker()
        else:                               // exit case
            walrcv_endstreaming()
            ReplicationSlotDropAtPubNode()
            replorigin_drop_by_name()
            break
    proc_exit()
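
For reference, the reuse/exit decision at the bottom of that loop boils
down to something like the following sketch. It assumes the worker already
has its replication connection (wrconn), slot name and origin name;
finish_sync_pass() and the more_tables_to_sync flag are made up here for
illustration, and clean_sync_worker() is the helper added in 0001.

#include "postgres.h"

#include "replication/origin.h"
#include "replication/walreceiver.h"
#include "replication/worker_internal.h"
#include "storage/ipc.h"

/*
 * Sketch of the end of one sync pass: either reset for the next table
 * (reuse case) or drop the remote slot and the origin and exit.
 */
static void
finish_sync_pass(WalReceiverConn *wrconn, char *slotname, char *originname,
				 bool more_tables_to_sync)
{
	TimeLineID	tli;

	if (more_tables_to_sync)
	{
		/* Reuse case: keep the connection, slot and origin around. */
		clean_sync_worker();
		return;
	}

	/* Exit case: nothing left to sync, tear everything down. */
	walrcv_endstreaming(wrconn, &tli);
	ReplicationSlotDropAtPubNode(wrconn, slotname, false /* missing_ok */ );
	replorigin_drop_by_name(originname, true /* missing_ok */ , false);

	proc_exit(0);
}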

> It seems it would be common for workers to be looking through the
> subscription rel states at the same time, and I don't really see how you
> prevent races in who is claiming a relation to work on. Though you take
> a shared lock on the LogicalRepWorkerLock, what if in between
> logicalrep_worker_find() and updating my MyLogicalRepWorker->relid,
> someone else also updates their relid to that relid. I don't think you
> can update LogicalRepWorker->relid with only a shared lock.
>
>
> I wonder if it is not better to have the apply leader, in
> process_syncing_tables_for_apply(), first check for an existing worker
> for the rel, then check for an available worker without an assignment,
> then launch a worker?
>
> Workers could then sleep after finishing their assignment and wait for
> the leader to give them a new assignment.

I'm not sure we should rely on a single apply worker to assign tables to
several tablesync workers. I suspect that moving the assignment
responsibility to the apply worker may add some overhead. But I agree
that a shared lock on LogicalRepWorkerLock is not good; I changed it to
an exclusive lock.

>
> Given an exclusive lock on LogicalRepWorkerLock, it may be okay for
> workers to find their own table assignments from the subscriptionrel --
> and perhaps this will be much more efficient from a CPU perspective. It
> feels just a bit weird to have the code doing that buried in
> process_syncing_tables_for_sync(). It seems like it should at least
> return out to a main table sync worker loop in which workers loop
> through finding a table and assigning it to themselves, syncing the
> table, and catching the table up.

Right, it shouldn't be process_syncing_tables_for_sync()'s responsibility.
I moved it into the TableSyncWorkerMain loop.
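Roughly, the claim step in that loop now looks like the sketch below. This
is a simplified illustration rather than the patch's actual code
(claim_relation_for_sync is a made-up name); the point is that finding
that no other worker has the relation and publishing
MyLogicalRepWorker->relid happen under the same exclusive
LogicalRepWorkerLock, so two workers can't pick the same table.

#include "postgres.h"

#include "replication/worker_internal.h"
#include "storage/lwlock.h"
#include "storage/spin.h"

/*
 * Simplified sketch: try to claim 'relid' (a relation still in INIT state)
 * for this tablesync worker.  Holding LogicalRepWorkerLock exclusively
 * makes the find-and-assign step atomic with respect to other workers.
 */
static bool
claim_relation_for_sync(Oid subid, Oid relid)
{
	bool		claimed = false;

	LWLockAcquire(LogicalRepWorkerLock, LW_EXCLUSIVE);

	/* Only claim it if no other worker is already syncing this relation. */
	if (logicalrep_worker_find(subid, relid, false) == NULL)
	{
		SpinLockAcquire(&MyLogicalRepWorker->relmutex);
		MyLogicalRepWorker->relid = relid;
		SpinLockRelease(&MyLogicalRepWorker->relmutex);
		claimed = true;
	}

	LWLockRelease(LogicalRepWorkerLock);

	return claimed;
}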

Also, I did some benchmarking, as I have done a couple of times
previously [1]. Here are the latest numbers:

With empty tables:
+--------+------------+-------------+--------------+
| | 10 tables | 100 tables | 1000 tables |
+--------+------------+-------------+--------------+
| master | 296.689 ms | 2579.154 ms | 41753.043 ms |
+--------+------------+-------------+--------------+
| patch | 210.580 ms | 1724.230 ms | 36247.061 ms |
+--------+------------+-------------+--------------+

With 10 tables loaded with some data:
+--------+------------+-------------+--------------+
| | 1 MB | 10 MB | 100 MB |
+--------+------------+-------------+--------------+
| master | 568.072 ms | 2074.557 ms | 16995.399 ms |
+--------+------------+-------------+--------------+
| patch | 470.700 ms | 1923.386 ms | 16980.686 ms |
+--------+------------+-------------+--------------+

It seems that even though master has improved since the last time I ran a
similar experiment, the patch still reduces the time spent in table sync
for empty/small tables.
There is also a decrease in the patch's performance compared with the
previous results [1]. Some portion of that might be caused by switching
from shared locks to exclusive locks; I'll look into it a bit more.

[1]
https://www.postgresql.org/message-id/CAGPVpCQdZ_oj-QFcTOhTrUTs-NCKrrZ%3DZNCNPR1qe27rXV-iYw%40mail.gmail.com

Best,
--
Melih Mutlu
Microsoft

Attachment Content-Type Size
0001-Reuse-Tablesync-Workers.patch application/octet-stream 28.3 KB
v9-0002-Add-replication-protocol-cmd-to-create-a-snapshot.patch application/octet-stream 21.0 KB
v12-0003-Reuse-Replication-Slot-and-Origin-in-Tablesync.patch application/octet-stream 59.6 KB
