Re: [PATCH] Improving index selection for logical replication apply with replica identity full

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Ethan Mertz <ethan(dot)mertz(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "onderkalaci(at)gmail(dot)com" <onderkalaci(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: [PATCH] Improving index selection for logical replication apply with replica identity full
Date: 2026-06-16 02:55:14
Message-ID: CALj2ACW1ax_6UNnt_N5fT=Puiq6SODhMB7n9W6zs-RBt6sB74w@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Fri, May 29, 2026 at 6:43 AM Ethan Mertz <ethan(dot)mertz(at)gmail(dot)com> wrote:
>
> I have amended the patch to include a comment explaining the behavior for
> non-unique indexes.
>
> Without invoking the planner, I figure it would be difficult to reason about
> performance of a scan on a non-unique index. I think a beneficial future
> optimization would allow users to selectively invoke the planner for their
> logical apply processes. (Possibly a new configuration in the subscription).
>
> Attached is the updated patch.

Thanks for the patch. The results look nice - going from 53 seconds to
under 1 second for 1,000 point updates is a significant improvement.

I'm curious to know a bit more about the use-case. With REPLICA
IDENTITY FULL, the publisher WAL-logs the full before-image of rows
for UPDATE and DELETE operations, which means more WAL generation on
the publisher side. What's the motivation for users to choose REPLICA
IDENTITY FULL despite this overhead? Is it primarily to support
downstream logical consumers that need full row images? If yes, what
are those downstream consumers doing with the full row images?

Also, I think the heuristic could go beyond just unique vs. non-unique
index preference. Factors like index bloat and index size could also
help make a better choice. For example, say there are two unique
indexes on a table - one on a few text columns and another on a bigint
column. Due to non-HOT updates or updates to the indexed columns
(which can happen both on the publisher and the subscriber), bloat on
the text index grows faster compared to the bigint index. On the
subscriber, if the index with more bloat is chosen, lookups become
slower due to more pages to traverse - even though both are unique
indexes that the heuristic would treat equally. Of course, fully
accounting for all these factors may amount to invoking the planner -
but even a simple check on index size (relpages) between
otherwise-equal candidates could help.

Some comments on the patch.

1/
+ q{select (idx_scan = 1) from pg_stat_all_indexes where indexrelname
= 'test_idx_select_uniq';}

Use pg_stat_force_next_flush before reading the stats.

2/
+ q{select (idx_scan = 1) from pg_stat_all_indexes where indexrelname
= 'test_idx_select_uniq';}

How about making this more predictable - instead of relying on the
scan count being just 1, capture idx_scan before and after, then
compare (after > before or after = before + 1). Although there are no
concurrent queries on these tables, it makes the test predictable.

3/
+ * If multiple usable indexes exist, unique indexes are preferred (they
+ * guarantee at most one tuple per scan), and among unique indexes those with
+ * fewer key columns win. The first usable non-unique index is accepted
+ * without further ranking.

Comment before the function and inside the function seems redundant -
can we dedup into one comment?

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Previous Message Corey Huinker 2026-06-16 02:53:22 Re: pg_restore handles extended statistics inconsistently with statistics data