Re: Excessive LOG messages from replication slot sync worker

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Excessive LOG messages from replication slot sync worker
Date: 2025-08-12 06:54:10
Message-ID: CAJpy0uDYtHGLkcr1sDqKFUm-ef33Kdky9R8qdDC5JgeM7VdWNA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 12, 2025 at 8:52 AM Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>
> Hi,
>
> While testing the replication slot synchronization feature, I set up
> three instances on the same machine:
>
> - Physical replication primary (also Logical replication publisher)
> - Physical replication standby
> - Logical replication subscriber
>
> ----------------------------------------
> initdb -D pri --encoding=UTF8 --locale=C
> cat <<EOF >> pri/postgresql.conf
> wal_level = logical
> log_line_prefix = '%m [pri][%b] '
> logging_collector = on
> EOF
> pg_ctl -D pri start
> pgbench -i
> cat <<EOF | psql
> CREATE PUBLICATION mypub FOR ALL TABLES;
> SELECT pg_create_physical_replication_slot('pslot');
> EOF
> cat <<EOF >> pri/postgresql.conf
> synchronized_standby_slots = 'pslot'
> EOF
> pg_ctl -D pri reload
>
> pg_basebackup -D sby -c fast
> touch sby/standby.signal
> cat <<EOF >> sby/postgresql.conf
> port = 5433
> primary_conninfo = 'host=localhost port=5432 dbname=postgres'
> primary_slot_name = 'pslot'
> sync_replication_slots = on
> hot_standby_feedback = on
> log_line_prefix = '%m [sby][%b] '
> EOF
> pg_ctl -D sby start
>
> initdb -D sub --encoding=UTF8 --locale=C
> cat <<EOF >> sub/postgresql.conf
> port = 5434
> wal_level = logical
> log_line_prefix = '%m [sub][%b] '
> logging_collector = on
> EOF
> pg_ctl -D sub start
> pgbench -i -Itp -p 5434
> cat <<EOF | psql -p 5434
> CREATE SUBSCRIPTION mysub CONNECTION 'host=localhost port=5432
> dbname=postgres' PUBLICATION mypub WITH (failover);
> EOF
> ----------------------------------------
>
>
> I noticed that the replication slot sync worker on the standby generates
> a large number of LOG messages, about two per 200–300 ms, during
> pgbench runs (pgbench -c 10 -j 10 -T 60). For example,
>
> ----------------------------------------
> 2025-08-12 11:48:21.067 JST [sby][slotsync worker] LOG: starting
> logical decoding for slot "mysub"
> 2025-08-12 11:48:21.067 JST [sby][slotsync worker] DETAIL: Streaming
> transactions committing after 0/04F317B8, reading WAL from 0/04006FC0.
> 2025-08-12 11:48:21.067 JST [sby][slotsync worker] LOG: logical
> decoding found consistent point at 0/04006FC0
> 2025-08-12 11:48:21.067 JST [sby][slotsync worker] DETAIL: There are
> no running transactions.
> 2025-08-12 11:48:21.296 JST [sby][slotsync worker] LOG: starting
> logical decoding for slot "mysub"
> 2025-08-12 11:48:21.296 JST [sby][slotsync worker] DETAIL: Streaming
> transactions committing after 0/04F9DC90, reading WAL from 0/04006FC0.
> 2025-08-12 11:48:21.297 JST [sby][slotsync worker] LOG: logical
> decoding found consistent point at 0/04006FC0
> 2025-08-12 11:48:21.297 JST [sby][slotsync worker] DETAIL: There are
> no running transactions.
> 2025-08-12 11:48:21.526 JST [sby][slotsync worker] LOG: starting
> logical decoding for slot "mysub"
> 2025-08-12 11:48:21.526 JST [sby][slotsync worker] DETAIL: Streaming
> transactions committing after 0/0506A5C0, reading WAL from 0/04006FC0.
> 2025-08-12 11:48:21.527 JST [sby][slotsync worker] LOG: logical
> decoding found consistent point at 0/04006FC0
> 2025-08-12 11:48:21.527 JST [sby][slotsync worker] DETAIL: There are
> no running transactions.
> 2025-08-12 11:48:21.762 JST [sby][slotsync worker] LOG: starting
> logical decoding for slot "mysub"
> 2025-08-12 11:48:21.762 JST [sby][slotsync worker] DETAIL: Streaming
> transactions committing after 0/050C7F70, reading WAL from 0/04006FC0.
> 2025-08-12 11:48:21.763 JST [sby][slotsync worker] LOG: logical
> decoding found consistent point at 0/04006FC0
> 2025-08-12 11:48:21.763 JST [sby][slotsync worker] DETAIL: There are
> no running transactions.
> 2025-08-12 11:48:21.999 JST [sby][slotsync worker] LOG: starting
> logical decoding for slot "mysub"
> 2025-08-12 11:48:21.999 JST [sby][slotsync worker] DETAIL: Streaming
> transactions committing after 0/05111420, reading WAL from 0/04006FC0.
> 2025-08-12 11:48:21.999 JST [sby][slotsync worker] LOG: logical
> decoding found consistent point at 0/04006FC0
> 2025-08-12 11:48:21.999 JST [sby][slotsync worker] DETAIL: There are
> no running transactions.
> ----------------------------------------
>
>
> Is this expected behavior, a configuration mistake on my side, or perhaps
> an oversight in the commit of the replication slot synchronization feature?
>
> This seems more like debug output left over from development rather than
> something useful for users. If this is true, would it make sense to change
> these messages from LOG to DEBUG?
>

Thank You for reporting the issue.

It looks like commit 2633dae (mistakenly) introduced a change ([1]) in
SnapBuildSnapshotExists(), altering the format used for snapshot file
names during the search. However, SnapBuildSerialize still uses the
old format ("%s/%X-%X.snap"), which led to the slot-sync worker being
unable to locate existing snapshot files. Reverting the format
resolved the issue for me. Could you please check if the attached
patch works for you as well?

[1]:
SnapBuildSnapshotExists()
- sprintf(path, "%s/%X-%X.snap",
+ sprintf(path, "%s/%08X-%08X.snap",

thanks
Shveta

Attachment Content-Type Size
0001-Revert-wrong-change-in-SnapBuildSnapshotExists.patch application/octet-stream 805 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2025-08-12 06:56:38 Re: Parallel Apply
Previous Message jian he 2025-08-12 06:37:10 Re: implement CAST(expr AS type FORMAT 'template')