Re: Proposal: Recent mutated table tracking in memory

From: Nadav Shatz <nadav(at)tailorbrands(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: pgpool-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Proposal: Recent mutated table tracking in memory
Date: 2026-05-24 17:00:47
Message-ID: CACeKOO0-ctbkHnV978Hz6xF44R_9AGG6StjEH-V=M9k-VcyxQg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgpool-hackers

Hi Tatsuo,

Your really_writing_transaction approach is the right fix -- it
addresses the root cause across all DLBOW modes, not just ours.
Thanks for digging into it.

I applied your v1 patch and rebased our feature on top. Attaching
both patches separately so they can land independently in the
order you prefer:

v5-0001-Fix-disable_load_balance_on_write-and-query-cache.patch
-- your patch unchanged (just rebased to apply cleanly on
current master without our feature underneath).

v5-0002-Feature-load-balancing-control-by-table-tracking.patch
-- our feature, on top of your fix.

Changes in v5-0002 vs v4:

- Dropped pool_has_dml_adaptive_write_in_transaction() helper and
the matching pool_has_dml_table_oids() exposure. The cache
fetch guards in pool_proto_modules.c now correctly use
pool_is_really_writing_transaction() from your patch, so the
helper became redundant.

- Kept the MAIN_REPLICA gate in CommandComplete.c for the
autocommit mark-stale branch. dml_adaptive_global is only
meaningful in streaming replication mode (matches the routing
logic in where_to_send_main_replica), and gating prevents the
hang we saw in native_replication where the autocommit branch
could run while an explicit transaction was actually in flight
on the backend.

I tried to run 006.memqcache with the mutation against the
combined branch but local master is currently broken (commit
2ae004a48 as you noted), so the standby setup fails before
reaching the jdbctest part. Both patches build cleanly and our
043.track_table_mutation passes on an earlier base. Will retest
once master is unbroken.

Thanks!

On Sat, May 23, 2026 at 2:18 PM Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:

> > I think you are talking about the logic to judge whether we are in an
> > explicite transaction or not here. Current dml_adaptive checks
> > supplied query is a transaction starting command like BEGIN. IMO this
> > is fundamentaly wrong because the command may fail by various reasons.
> > The correct way is checking transaction state by using TSTATE
> > macro. Note that the macro can only be used at leat one ready for
> > query response returned from backend (simple query protocol case), or
> > command complete response is returned from backend (extended query
> > protocol case).
> >
> >> In native replication and snapshot isolation modes,
> >> dml_adaptive() is never called (it lives inside
> >> where_to_send_main_replica), so is_in_transaction is never set
> >> to true even inside an explicit BEGIN/COMMIT block. That meant
> >> every DML in those modes was treated as autocommit by the
> >> write-tracking code, triggering
> >> pool_track_table_mutation_get_database_oid() ― which does a
> >> relcache do_query ― while a transaction was actually in flight
> >> on the backend connection. The do_query conflicts with the
> >> in-flight transaction and hangs the session.
> >
> > Assuming "a transaction was actually in flight" means a transaction
> > was open (explicit transaction), not really. do_query can be called
> > inside or outside of an explicit transaction.
> >
> > Anyway, I found dml_adaptive is completely broken (it brings wrong
> > results if query cache enabled). Unless there are users for the
> > feature, maybe we should remove dml_adaptive entirely?
>
> It appears that other options of disable_load_balance_on_write are all
> broken too, except "transaction". I don't want to discard all of them,
> so I come up with attached patch.
>
> The query cache relies on is_writing_transaction of session context to
> judge whether cache can be safely used. However,
> disable_load_balance_on_write overrides it to true when it should not,
> and vice versa for its own purpose. To fix this, a new session context
> variable "really_writing_transaction" is introduced. It is almost same
> as existing writing_transaction, but it faithfully tracks whether a
> writing query appears in an explicit transaction. The query cache uses
> it instead of writing_transaction variable.
>
> Currently, master branch is broken because of commit 2ae004a48. If
> you want to try the patch, I recommend to checkout 48e1d6d3c, then
> apply the patch.
>
> Regards,
> --
> Tatsuo Ishii
> SRA OSS K.K.
> English: http://www.sraoss.co.jp/index_en/
> Japanese:http://www.sraoss.co.jp
>

--
Nadav Shatz
Tailor Brands | CTO

Attachment Content-Type Size
v5-0001-Fix-disable_load_balance_on_write-and-query-cache.patch application/octet-stream 8.3 KB
v5-0002-Feature-load-balancing-control-by-table-tracking.patch application/octet-stream 90.1 KB

In response to

Browse pgpool-hackers by date

  From Date Subject
Next Message Koshino Taiki 2026-05-27 01:04:29 Pgpool-II Release Date Postponement
Previous Message Tatsuo Ishii 2026-05-23 11:18:16 Re: Proposal: Recent mutated table tracking in memory