From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | pg_recvlogical: Prevent flushed data from being re-sent after restarting replication |
Date: | 2025-09-04 16:16:14 |
Message-ID: | CAHGQGwFeTymZQ7RLvMU6WuDGar8bUQCazg=VOfA-9GeBkg-FzA@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
When pg_recvlogical loses its connection, it reconnects and restarts
replication unless the --no-loop option is used. I noticed that in this
scenario, data that has already been flushed can be re-sent after
replication restarts. This happens because the start position used for
the restart is taken from the write position in the last status update
message, which may be older than the actual position of the last flushed
data. As a result, some flushed data may lie beyond the replication
start position and be re-sent. Is this a bug?
To fix this, I'd like to propose the attached patch, which ensures that
all written data is flushed to disk before restarting replication and
uses the last flushed position as the replication start point. This
prevents already-flushed data from being re-sent.
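
To illustrate the idea, here is a minimal sketch of the two restart
strategies. It is not taken from the patch: the Receiver struct, the Lsn
typedef, and all function names are hypothetical, and LSNs are modelled
as plain integers.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Lsn;			/* hypothetical stand-in for an LSN */

/* Hypothetical receiver state; field names are illustrative only. */
typedef struct Receiver
{
	Lsn			written_lsn;	/* end of data written to the output file */
	Lsn			flushed_lsn;	/* end of data known to be flushed to disk */
	Lsn			reported_lsn;	/* write position in the last status update */
} Receiver;

/* Record that data up to "end" has been written (but not yet flushed). */
static void
recv_write(Receiver *r, Lsn end)
{
	r->written_lsn = end;
}

/* Model a flush: everything written so far becomes durable. */
static void
recv_flush(Receiver *r)
{
	r->flushed_lsn = r->written_lsn;
}

/* Model sending a status update that carries the current write position. */
static void
recv_send_status(Receiver *r)
{
	r->reported_lsn = r->written_lsn;
}

/* Problem case: restart from the position in the last status update. */
static Lsn
restart_from_last_status(const Receiver *r)
{
	return r->reported_lsn;
}

/* Proposed approach: flush everything first, then restart from there. */
static Lsn
restart_after_flush(Receiver *r)
{
	recv_flush(r);
	assert(r->flushed_lsn == r->written_lsn);
	return r->flushed_lsn;
}
```

For example, if data up to 100 is written, flushed, and reported, and
data up to 200 is then written and flushed without a further status
update, restart_from_last_status() returns 100, so the range between 100
and 200 would be re-sent, while restart_after_flush() returns 200.
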
Additionally, I found that when the --no-loop option is used,
pg_recvlogical can exit without flushing written data, risking data
loss. The attached patch also fixes this by ensuring that all data is
flushed to disk before exiting with --no-loop.
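
The general "flush before exit" pattern can be sketched with plain POSIX
calls as below. Again, this is only an illustration and not the patch:
the helper name write_and_flush_before_exit() and the way the output
file is opened are assumptions made for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: append "buf" to "path" and make sure it has
 * reached disk before returning, so that a caller exiting right
 * afterwards does not lose written-but-unflushed data.
 * Returns 0 on success, -1 on failure.
 */
static int
write_and_flush_before_exit(const char *path, const char *buf)
{
	size_t		len;
	size_t		off = 0;
	int			fd;

	assert(path != NULL && buf != NULL);
	len = strlen(buf);

	fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
	if (fd < 0)
		return -1;

	/* write() may be partial, so loop until everything is written */
	while (off < len)
	{
		ssize_t		n = write(fd, buf + off, len - off);

		if (n < 0)
		{
			close(fd);
			return -1;
		}
		off += (size_t) n;
	}

	/* Without this fsync(), the data may still sit in OS buffers at exit. */
	if (fsync(fd) != 0)
	{
		close(fd);
		return -1;
	}

	return close(fd);
}
```

A caller would invoke this (or an equivalent flush of an already open
file descriptor) on the exit path and treat a non-zero result as an
error rather than exiting silently.
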
Thoughts?
Regards,
--
Fujii Masao
Attachment | Content-Type | Size |
---|---|---|
v1-0001-pg_recvlogical-Prevent-flushed-data-from-being-re.patch | application/octet-stream | 2.2 KB |