pg_upgrade: optimize replication slot caught-up check

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: pg_upgrade: optimize replication slot caught-up check
Date: 2026-01-05 18:02:34
Message-ID: CAD21AoBZ0LAcw1OHGEKdW7S5TRJaURdhEk3CLAW69_siqfqyAg@mail.gmail.com
Lists: pgsql-hackers

Hi all,

Commit 29d0a77fa6 improved pg_upgrade to allow migrating logical
slots. Currently, to check if the slots are ready to be migrated, we
call binary_upgrade_logical_slot_has_caught_up() for every single
slot. This checks whether any WAL records remain unconsumed. However,
we noticed a performance issue: with many slots (e.g., 100 or more) or
a WAL backlog, checking all slots one by one takes a long time.

Here are some test results from my environment:
With an empty cluster: 1.55s
With 200 slots and 30MB backlog: 15.51s

Commit 6d3d2e8e5 introduced parallel checks per database, but a single
job might still have to check too many slots, causing delays.

Since binary_upgrade_logical_slot_has_caught_up() essentially checks
whether any decodable record exists in the database, IIUC it is not
necessary to check every slot. We can optimize this by checking only
the slot with the minimum confirmed_flush_lsn in each database. Every
other slot starts from a later LSN and so covers a subset of the same
WAL range; if the oldest slot is caught up, the others are too. The
attached patch implements this optimization. With the patch, the test
with 200 slots finished in 2.512s, and the execution time is now
stable regardless of the number of slots.
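
To make the idea concrete, here is a minimal standalone sketch in C. It
is not the code from the attached patch: SlotInfo, Lsn,
oldest_slot_index(), has_decodable_record_from() and
all_slots_caught_up() are hypothetical names, and the WAL is modeled as
a plain array of LSNs at which decodable records exist, assuming a
record at or after a slot's confirmed_flush_lsn counts as unconsumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the types used by pg_upgrade or the patch. */
typedef uint64_t Lsn;

typedef struct SlotInfo
{
    const char *name;
    Lsn         confirmed_flush_lsn;
} SlotInfo;

/* Return the index of the slot with the smallest confirmed_flush_lsn. */
static size_t
oldest_slot_index(const SlotInfo *slots, size_t nslots)
{
    size_t      oldest = 0;

    assert(nslots > 0);
    for (size_t i = 1; i < nslots; i++)
    {
        if (slots[i].confirmed_flush_lsn < slots[oldest].confirmed_flush_lsn)
            oldest = i;
    }
    return oldest;
}

/*
 * Toy model of the WAL scan: report whether any decodable record sits at
 * or after 'start'. In the real server this is the expensive step.
 */
static bool
has_decodable_record_from(const Lsn *decodable, size_t nrecords, Lsn start)
{
    for (size_t i = 0; i < nrecords; i++)
    {
        if (decodable[i] >= start)
            return true;
    }
    return false;
}

/*
 * One scan per database instead of one per slot: every other slot would
 * scan a suffix of the range scanned for the oldest slot.
 */
static bool
all_slots_caught_up(const SlotInfo *slots, size_t nslots,
                    const Lsn *decodable, size_t nrecords)
{
    size_t      oldest;

    if (nslots == 0)
        return true;            /* nothing to check */

    oldest = oldest_slot_index(slots, nslots);
    return !has_decodable_record_from(decodable, nrecords,
                                      slots[oldest].confirmed_flush_lsn);
}
```

In this model the number of scans stays at one per database no matter
how many slots exist, which matches the stable timing reported above.
The sketch deliberately ignores per-slot filtering by the output plugin.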

One thing to note is that DecodeTXNNeedSkip() also considers
replication origin filters. Theoretically, a plugin could filter out
specific origins, so the per-slot result could differ from the
per-database one. However, this is a very rare case. Even if it
happens, it would just result in a false positive (a slot is reported
as not caught up and the upgrade fails safely), so the impact is
minimal. Therefore, the patch simplifies the check to be per-database
instead of per-slot.

Feedback is very welcome.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment: v1-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch (application/octet-stream, 8.7 KB)