Optimize RelfilenumberMapInvalidateCallback for logical decoding performance

From: "yangboyu" <yangboyu(dot)yby(at)alibaba-inc(dot)com>
To: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>
Subject: Optimize RelfilenumberMapInvalidateCallback for logical decoding performance
Date: 2026-06-03 08:13:44
Message-ID: 5739ba24-6fe7-4559-9ed1-7fbe690071f0.yangboyu.yby@alibaba-inc.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers,
I'd like to propose a patch that improves walsender performance when replaying
invalidation messages with a large number of relations.
Background
==========
The current RelfilenumberMapInvalidateCallback implementation performs a full
sequential scan of the hash table — even when invalidating a single specific
relation. In deployments with many tables (such as in multi-tenant systems),
this cache can grow to thousands of entries, making each invalidation callback
O(n). Since a single DDL (even an analyze operation) can generate multiple
invalidation messages, this can become a bottleneck for walsender throughput.
Solution
========
To address the issue, we introduce
1. A reverse hash table (`RelfilenumberReverseHash`): maps `relid -> RelfilenumberMapKey`,
enabling O(1) lookup and removal when a specific relation is invalidated.
2. A singly-linked list (`NegativeEntryList`): tracks negative cache entries
(relid == InvalidOid) separately.
Performance
===========
Test methodology: Create N tables, insert one row into each table, then
ALTER TABLE ADD COLUMN on each table. Measure wall-clock time for
`pg_recvlogical` (test_decoding plugin) to decode the WAL. DML and DDL
decode costs are measured separately using two replication slots. Built
with `-O3`, fsync=off, logical_decoding_work_mem='4GB' to isolate CPU cost.
Separate transactions (each ALTER TABLE is its own transaction):
| Tables | Before DDL (ms) | After DDL (ms) | Speedup |
|--------|----------------:|---------------:|---------:|
| 1,000 | 18 | 13 | 1.4x |
| 5,000 | 379 | 83 | 4.6x |
| 10,000 | 1,667 | 243 | 6.9x |
| 50,000 | 61,845 | 4,069 | 15.2x |
Single transaction (all INSERTs and ALTER TABLEs wrapped in one
BEGIN...COMMIT; DDL cost isolated by subtracting a DML-only transaction
of the same size):
| Tables | Before DDL (ms) | After DDL (ms) | Speedup |
|--------|----------------:|---------------:|---------:|
| 1,000 | 10 | 5 | 2.0x |
| 5,000 | 194 | 28 | 6.9x |
| 10,000 | 846 | 52 | 16.3x |
| 50,000 | 29,206 | 343 | 85.2x |
DML-only decode time is unchanged across all scales (no regression).
All regression tests pass.
Feedback welcome. The patch is attached.
Regards,
yangboyu

Attachment Content-Type Size
v1-0001-accelerate_invalidation_message_replay_in_logical_decoding.patch application/octet-stream 6.5 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2026-06-03 08:14:47 Re: Use TEXT_DUMP_HEADER macro instead of hard-coded string in pg_backup_archiver.c
Previous Message Fujii Masao 2026-06-03 08:13:37 Re: ECPG: inconsistent behavior with the document in “GET/SET DESCRIPTOR.”