Re: pg_upgrade: optimize replication slot caught-up check

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: pg_upgrade: optimize replication slot caught-up check
Date: 2026-01-20 06:38:29
Message-ID: CAJpy0uD08J-tii7THfK0uAp+OarTn1PhDzUyiRbCa0eJ3cfwkQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 14, 2026 at 11:24 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> I've attached the updated patch.
>

Thank You for the patch. I like the idea of optimization. Few initial comments:

1)
+ * The query returns the slot names and their caught-up status in
+ * the same order as the results collected by
+ * get_old_cluster_logical_slot_infos(). If this query is changed,

I could not find the function get_old_cluster_logical_slot_infos(), do
you mean get_old_cluster_logical_slot_infos_query()?

2)
" WHERE database = current_database() AND "
" slot_type = 'logical' AND "

Is there a reason why database = current_database() is placed before
slot_type = 'logical'? I am not sure how the PostgreSQL optimizer and
executor will order these predicates, but from the first look,
slot_type = 'logical' appears cheaper and could be placed first,
consistent with the ordering used at other places.

3)
Shouldn’t we add a sanity check inside
get_old_cluster_logical_slot_infos_query() to ensure that when
skip_caught_up_check is true, we are on PostgreSQL 18 or lower? This
would make the function safer for future use if it's called elsewhere.
I understand the caller already performs a similar check, but I think
it's more appropriate here since we call
binary_upgrade_logical_slot_has_caught_up() from inside, which doesn’t
even exist on newer versions.

4)
+# Check the file content. While both test_slot1 and test_slot2 should
be reporting
+# that they have unconsumed WAL records, test_slot3 should not be reported as
+# it has caught up.

Can you please elaborate the reason behind test_slot3 not being
reported? Also mention in the comment if possible.

thanks
Shveta

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2026-01-20 06:45:12 Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format
Previous Message Daniil Davydov 2026-01-20 06:35:27 Fix comments for buf_id field of BufferDesc structure