Hi hackers,
I'd like to propose a patch that improves walsender performance when processing
invalidation messages on systems with a large number of relations.
Background
==========
The current RelfilenumberMapInvalidateCallback implementation performs a full
sequential scan of the relfilenumber map hash table, even when invalidating a
single specific relation. In deployments with many tables (such as multi-tenant
systems), this cache can grow to thousands of entries, making each invalidation
callback O(n) in the cache size. Since a single DDL statement (or even an
ANALYZE) can generate multiple invalidation messages, this can become a
bottleneck for walsender throughput.
Solution
========
To address the issue, the patch introduces two auxiliary structures:
1. A reverse hash table (`RelfilenumberReverseHash`): maps `relid -> RelfilenumberMapKey`,
enabling O(1) lookup and removal when a specific relation is invalidated.
2. A singly-linked list (`NegativeEntryList`): tracks negative cache entries
(relid == InvalidOid) separately, so they can be flushed without scanning the
whole hash table.
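
To make the idea concrete, here is a minimal standalone sketch of the two
structures. It is not the patch code: plain arrays stand in for the dynahash
tables, and every name in it (`Entry`, `pool`, `reverse_index`,
`negative_list`, `cache_insert`, `cache_invalidate`, the `MAX_*` limits) is
hypothetical. It also assumes at most one cached key per relid, and that
negative entries are flushed on every invalidation, as the existing callback
does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;
#define InvalidOid  ((Oid) 0)
#define MAX_ENTRIES 1024    /* toy capacity of the forward cache */
#define MAX_RELID   4096    /* toy bound so the reverse index can be an array */

typedef struct { Oid reltablespace; Oid relfilenumber; } MapKey;

typedef struct Entry {
    MapKey        key;
    Oid           relid;    /* InvalidOid marks a negative entry */
    int           in_use;
    struct Entry *next;     /* links the free list or the negative list */
} Entry;

static Entry  pool[MAX_ENTRIES];        /* stand-in for the forward hash */
static Entry *free_list;
static Entry *reverse_index[MAX_RELID]; /* stand-in for relid -> entry hash */
static Entry *negative_list;            /* all negative entries */
static int    live_count;

static void cache_init(void)
{
    free_list = NULL;
    negative_list = NULL;
    live_count = 0;
    for (int i = 0; i < MAX_RELID; i++)
        reverse_index[i] = NULL;
    for (int i = 0; i < MAX_ENTRIES; i++) {
        pool[i].in_use = 0;
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

/* Return an entry to the free list. */
static void release(Entry *e)
{
    e->in_use = 0;
    e->next = free_list;
    free_list = e;
    live_count--;
}

static Entry *cache_insert(MapKey key, Oid relid)
{
    Entry *e = free_list;

    if (e == NULL || relid >= MAX_RELID)
        return NULL;
    /* Toy invariant: one entry per relid, so evict any older one first. */
    if (relid != InvalidOid && reverse_index[relid] != NULL)
        release(reverse_index[relid]);
    e = free_list;
    free_list = e->next;
    e->key = key;
    e->relid = relid;
    e->in_use = 1;
    if (relid == InvalidOid) {          /* negative entry: chain it */
        e->next = negative_list;
        negative_list = e;
    } else {                            /* positive entry: index by relid */
        e->next = NULL;
        reverse_index[relid] = e;
    }
    live_count++;
    return e;
}

static void cache_invalidate(Oid relid)
{
    /* Negative entries are always flushed; cost is their count only. */
    while (negative_list != NULL) {
        Entry *e = negative_list;
        negative_list = e->next;
        release(e);
    }
    if (relid == InvalidOid) {          /* complete reset: full walk */
        for (int i = 0; i < MAX_ENTRIES; i++)
            if (pool[i].in_use) {
                reverse_index[pool[i].relid] = NULL;
                release(&pool[i]);
            }
        return;
    }
    /* Single relation: one lookup instead of a scan of every entry. */
    if (relid < MAX_RELID && reverse_index[relid] != NULL) {
        release(reverse_index[relid]);
        reverse_index[relid] = NULL;
    }
}
```

In this sketch only the complete-reset case (relid == InvalidOid) still walks
every entry; invalidating one relation touches the negative list plus a single
reverse-index slot, regardless of how many positive entries are cached.
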
Performance
===========
Test methodology: create N tables, insert one row into each table, then run
ALTER TABLE ... ADD COLUMN on each table. Measure the wall-clock time for
`pg_recvlogical` (test_decoding plugin) to decode the WAL. DML and DDL
decode costs are measured separately using two replication slots. The server
was built with `-O3` and run with fsync=off and
logical_decoding_work_mem='4GB' to isolate CPU cost.
Separate transactions (each ALTER TABLE is its own transaction):
| Tables | DDL decode, unpatched (ms) | DDL decode, patched (ms) | Speedup |
|--------|----------------:|---------------:|---------:|
| 1,000 | 18 | 13 | 1.4x |
| 5,000 | 379 | 83 | 4.6x |
| 10,000 | 1,667 | 243 | 6.9x |
| 50,000 | 61,845 | 4,069 | 15.2x |
Single transaction (all INSERTs and ALTER TABLEs wrapped in one
BEGIN...COMMIT; DDL cost isolated by subtracting a DML-only transaction
of the same size):
| Tables | DDL decode, unpatched (ms) | DDL decode, patched (ms) | Speedup |
|--------|----------------:|---------------:|---------:|
| 1,000 | 10 | 5 | 2.0x |
| 5,000 | 194 | 28 | 6.9x |
| 10,000 | 846 | 52 | 16.3x |
| 50,000 | 29,206 | 343 | 85.2x |
DML-only decode time is unchanged across all scales (no regression).
All regression tests pass.
Feedback welcome. The patch is attached.
Regards,
yangboyu