Re: Proposal: Recent mutated table tracking in memory

From: Nadav Shatz <nadav(at)tailorbrands(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: pgpool-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Proposal: Recent mutated table tracking in memory
Date: 2026-02-12 09:05:19
Message-ID: CACeKOO21jgrafj6RzO+ibLjBDUdDeQ0Tuu3tatkmGUdQi+GAAQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgpool-hackers

Hi Tatsuo,

Thank you for the careful review. You raised an important concern. I've
addressed it in the updated patch — here's the explanation:

The attack scenario you describe is now handled. In the updated patch,
writes inside explicit transactions are only flushed to the shared-memory
table map at COMMIT time. If the transaction is rolled back, the table is
never marked as stale. So the attack pattern:

BEGIN;
UPDATE t1 SET i = 1 WHERE FALSE;
ROLLBACK;

has zero effect on the shared-memory table map. The dml_adaptive_global
mode piggybacks on the existing dml_adaptive per-transaction write list
(transaction_temp_write_list). On COMMIT, the accumulated table names are
resolved to OIDs and flushed to shared memory. On ROLLBACK,
the list is simply discarded (the existing dml_adaptive behavior).

For autocommit statements (outside explicit transactions), tables are
marked immediately — but in that case the write is committed, so this is
correct.

Regression test included. Test 042 now includes:
- Test 10: verifies that BEGIN; INSERT; ROLLBACK; SELECT does NOT route
the SELECT to primary
- Test 11: verifies that BEGIN; INSERT; COMMIT; SELECT DOES route the
SELECT to primary

Additional context on the threat model:

1. This feature requires disable_load_balance_on_write =
'dml_adaptive_global' — it is opt-in, not enabled by default. Operators who
enable it accept documented trade-offs (additional shared memory, TTL-based
staleness window).
2. An attacker who can connect and execute SQL against pgpool already has
the ability to cause far more damage (DROP TABLE, mass DELETEs, resource
exhaustion via expensive queries, connection flooding, etc.). The
table-marking via committed writes is a minor concern compared to
those vectors. Authentication, connection limits, and network security
are the appropriate defenses at that layer.
3. Even in the worst case (an attacker commits real writes in a loop),
the impact is bounded: the stale marking is temporary (TTL-based, typically
a few seconds), and only affects load-balancing decisions — it doesn't
cause data loss or correctness issues.
4. The existing dml_adaptive mode has analogous behavior: within a
transaction, a write to table T causes all reads of T to go to primary for
the remainder of that transaction. The only difference is scope —
dml_adaptive_global extends this across sessions with a TTL.

Thanks!

On Wed, Feb 11, 2026 at 12:28 PM Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:

> > Hi Tatsuo,
> >
> > After reading more about disable_load_balance_on_write=dml_adaptive i
> came
> > to the thought that this feature is actually an "extension" of that since
> > it covers "global" and not just per transaction behavior. in any case i
> > think it makes more sense that it sits under
> > the disable_load_balance_on_write and not as a standalone for clarity.
> >
> > I'm attaching below an updated patch with these adjustments.
> >
> > Please let me know what you think.
>
> I worry about the transactional behavior with the patch:
>
> + This means that if a transaction is rolled back, the table remains
> marked as stale until
> + the TTL expires, even though no actual data modification occurred.
> This is by design:
>
> This allows attackers to issue simple command continuously to
> effectively disable load balance (and increase the load of primary) in
> whole system:
>
> BEGIN;
> UPDATE t1 SET i = 1 WHERE FALSE;
> ROLLBACK;
>
> I think if the patch allows that, we cannot accept the patch.
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS K.K.
> English: http://www.sraoss.co.jp/index_en/
> Japanese:http://www.sraoss.co.jp
>
> > On Fri, Feb 6, 2026 at 1:29 PM Nadav Shatz <nadav(at)tailorbrands(dot)com>
> wrote:
> >
> >> Hi Tatsuo,
> >>
> >> Thank you for all the great comments and questions! I took under
> >> consideration all of them either adding support/tests or detailing the
> >> limitations in the docs.
> >>
> >> Let me know what you think of the latest patch attached here
> >>
> >> On Wed, Feb 4, 2026 at 1:23 AM Tatsuo Ishii <ishii(at)postgresql(dot)org>
> wrote:
> >>
> >>> From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
> >>> Subject: Re: Proposal: Recent mutated table tracking in memory
> >>> Date: Tue, 03 Feb 2026 16:43:53 +0900 (JST)
> >>> Message-ID: <20260203(dot)164353(dot)362943818466117773(dot)ishii(at)postgresql(dot)org>
> >>>
> >>> > Hi Nadav,
> >>> >
> >>> > Thank you for updating the patch!
> >>> >
> >>> >> Thank you for the comments!
> >>> >>
> >>> >> I agree with all of them. Let me know what you think of the changes
> >>> and new
> >>> >> naming.
> >>> >
> >>> > I still think "memory_map" is too generic. Anything put on memory for
> >>> > data mapping could be called "memory map". I recommend to change the
> >>> > name to more feature specific one: What about replacing "memory_map"
> >>> > with "track_table_mutation"? It's a little bit longer name but it
> >>> > clearly represents the feature. Any better ideas are welcome.
> >>> >
> >>> > - memory_map_enabled: Enable/disable the feature (default: off)
> >>> > - memory_map_ttl_factor: TTL multiplier for replication delay
> (default:
> >>> 5.0)
> >>> > - memory_map_cold_start_duration: Cold start period in ms (default:
> >>> 2000)
> >>> > - memory_map_table_buckets: Hash buckets for table map (default:
> 1024)
> >>> > - memory_map_table_size: Max tracked tables (default: 2048)
> >>> > - memory_map_query_buckets: Hash buckets for query cache (default:
> 2048)
> >>> > - memory_map_query_cache_size: Max cached queries (default: 10000)
> >>> >
> >>> > Also I feel memory_map_query_cache_size is confusing because there's
> >>> > already "query cache" feature in pgpool. Can we change it something
> >>> > like "query_parse_cache_size"?
> >>> >
> >>> > Review comments:
> >>> >
> >>> > (1) Why the regression test is 45? Shouldn't it be 42? (the last
> >>> > feature test is 041.external_replication_delay).
> >>> >
> >>> > (2) You enhance the patch to deal with leader watch changing. That's
> >>> > good. However, I don't see a test case for it in test.sh.
> >>> >
> >>> > (3) It seems the patch does not support TRUNCATE, MERGE, PREPARE and
> >>> > WITH + updating. If so, it should be noted in the docs as a
> limitation
> >>> > of the feature.
> >>>
> >>> (4) It seems the patch does not consider transactions. If an UPDATE is
> >>> performed in a transaction and the transaction gets rollbacked, load
> >>> balance is disabled despite that fact that the table modification did
> >>> not happen.
> >>>
> >>> Best regards,
> >>> --
> >>> Tatsuo Ishii
> >>> SRA OSS K.K.
> >>> English: http://www.sraoss.co.jp/index_en/
> >>> Japanese:http://www.sraoss.co.jp
> >>>
> >>
> >>
> >> --
> >> Nadav Shatz
> >> Tailor Brands | CTO
> >>
> >
> >
> > --
> > Nadav Shatz
> > Tailor Brands | CTO
>

--
Nadav Shatz
Tailor Brands | CTO

Attachment Content-Type Size
table_track.patch application/octet-stream 99.1 KB

In response to

Responses

Browse pgpool-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2026-02-18 23:51:21 Re: Proposal: Recent mutated table tracking in memory
Previous Message Tatsuo Ishii 2026-02-11 10:28:45 Re: Proposal: Recent mutated table tracking in memory