| From: | Nadav Shatz <nadav(at)tailorbrands(dot)com> |
|---|---|
| To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
| Cc: | pgpool-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Proposal: Recent mutated table tracking in memory |
| Date: | 2026-05-18 09:54:20 |
| Message-ID: | CACeKOO31db8_uiZj+6cKRZUBt+iDc82xhEwHeOh+7rHT6gf5mg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgpool-hackers |
Hi Tatsuo,
Any update on this proposal?
Thanks
On Thu, Apr 23, 2026 at 5:16 PM Nadav Shatz <nadav(at)tailorbrands(dot)com> wrote:
> Hi Tatsuo,
>
> Good catch on the 006.memqcache timeout. My previous fix had
> wrong side effects -- setting writing_transaction for
> dml_adaptive_global also changed routing behavior (it forced the
> whole transaction to primary, effectively reducing the feature to
> 'transaction' mode). That's what caused the hang.
>
> Fixed properly in v3: instead of touching writing_transaction,
> added a memqcache-specific guard that checks whether the current
> dml_adaptive* session has tracked writes in the current
> transaction, and skips the cache fetch if so.
>
> Attached: v3-0001-Feature-load-balancing-control-by-table-tracking.patch
>
> Changes in v3 vs v2:
>
> - pool_set_writing_transaction() reverted to original behavior
> (dml_adaptive_global no longer sets writing_transaction, so
> routing stays per-table as intended).
>
> - Added new helper pool_has_dml_adaptive_write_in_transaction()
> in pool_session_context.c. Returns true when the current session
> is in dml_adaptive* mode, is inside an explicit transaction, and
> has already tracked at least one write (via
> transaction_temp_write_list).
>
> - The two memqcache fetch guards in pool_proto_modules.c
> (simple query at line 270, extended query at line 1028) now
> also call !pool_has_dml_adaptive_write_in_transaction().
> Autocommit writes in dml_adaptive_global are still handled by
> the existing pool_invalidate_query_cache() at COMMIT time --
> no change needed there.
>
> Verified locally by mutating 006.memqcache with
> disable_load_balance_on_write = 'dml_adaptive_global' in the
> streaming replication mode (the only mode where dml_adaptive
> applies) and the jdbctest now correctly returns "2" instead of
> the stale cached "1". Both 006.memqcache and 043.track_table_mutation
> pass.
>
> Thanks!
>
> On Thu, Apr 23, 2026 at 11:14 AM Tatsuo Ishii <ishii(at)postgresql(dot)org>
> wrote:
>
>> Hi Nadav,
>>
>> Unfortunately the mutated 006.memqcache failed (timeout).
>>
>> >> > memqcache bug fix
>> >> > -----------------
>> >> >
>> >> > Good catch. The root cause: pool_set_writing_transaction() was
>> >> > explicitly skipping dml_adaptive_global, so
>> >> > pool_is_writing_transaction() always returned false in this mode.
>> >> > The query cache fetch guard at pool_proto_modules.c:270
>> >> > (!pool_is_writing_transaction()) then served stale cached results
>> >> > after DML in the same transaction.
>> >> >
>> >> > Fix: pool_set_writing_transaction() now sets the flag for
>> >> > dml_adaptive_global (only 'off' and 'dml_adaptive' skip it). This
>> >> > ensures the query cache is properly bypassed after writes within
>> >> > the same transaction.
>>
>> Regards,
>> --
>> Tatsuo Ishii
>> SRA OSS K.K.
>> English: http://www.sraoss.co.jp/index_en/
>> Japanese:http://www.sraoss.co.jp
>>
>> > Hi Tatsuo,
>> >
>> > Rebased onto current master, renumbered the regression tests
>> > (043/044 to avoid collision with the new 042.ssl_reload), and
>> > combined everything into a single commit.
>> >
>> > Attached: v2-0001-Feature-load-balancing-control-by-table-tracking.patch
>> >
>> > Looking forward to your review.
>> >
>> >
>> > On Sun, Apr 19, 2026 at 10:25 AM Tatsuo Ishii <ishii(at)postgresql(dot)org>
>> wrote:
>> >
>> >> > Hi Tatsuo,
>> >> >
>> >> > hank you for the detailed review. Attached patch addresses all items.
>> >>
>> >> I guess the attached patch is on top of
>> >> v1-0001-Feature-load-balancing-control-by-table-tracking.patch. To
>> >> apply v2-0001-address-review.patch, we need to apply
>> >> v1-0001-Feature-load-balancing-control-by-table-tracking.patch first.
>> >> Unfortunately due to recent commit, it does not apply anymore. Can you
>> >> please provide v1 + v2 that are rebased against latest master branch?
>> >> Also 042 regression test is already used by recent commit. Can you
>> >> renumber 042.track_table_mutation and
>> >> 043.track_table_mutation_watchdog to 043.track_table_mutation and
>> >> 044.track_table_mutation_watchdog accordingly?
>> >>
>> >> Looking forward to seeing new patch.
>> >>
>> >> Regards,
>> >> --
>> >> Tatsuo Ishii
>> >> SRA OSS K.K.
>> >> English: http://www.sraoss.co.jp/index_en/
>> >> Japanese:http://www.sraoss.co.jp
>> >>
>> >>
>> >> > memqcache bug fix
>> >> > -----------------
>> >> >
>> >> > Good catch. The root cause: pool_set_writing_transaction() was
>> >> > explicitly skipping dml_adaptive_global, so
>> >> > pool_is_writing_transaction() always returned false in this mode.
>> >> > The query cache fetch guard at pool_proto_modules.c:270
>> >> > (!pool_is_writing_transaction()) then served stale cached results
>> >> > after DML in the same transaction.
>> >> >
>> >> > Fix: pool_set_writing_transaction() now sets the flag for
>> >> > dml_adaptive_global (only 'off' and 'dml_adaptive' skip it). This
>> >> > ensures the query cache is properly bypassed after writes within
>> >> > the same transaction.
>> >> >
>> >> > Removed dead query parse cache code (~700 lines)
>> >> > -------------------------------------------------
>> >> >
>> >> > You're right -- pool_track_table_mutation_get_cached_parse,
>> >> > pool_track_table_mutation_cache_parse, and
>> >> > pool_track_table_mutation_normalize_and_hash were never called.
>> >> > These were leftover from an earlier design where we planned to
>> >> > cache SQL parse results in shared memory. The feature ended up
>> >> > using pgpool's existing parser directly, and this code was never
>> >> > wired up.
>> >> >
>> >> > Removed: QueryParseCache and QueryParseEntry structs, all related
>> >> > static functions, the TRACK_TABLE_MUTATION_QUERY_SEM semaphore,
>> >> > and the track_table_mutation_query_buckets /
>> >> > track_table_mutation_query_parse_cache_size configuration
>> >> > parameters. This also reduces shared memory usage from ~6.4 MB
>> >> > to ~80 KB with default settings.
>> >> >
>> >> > check_object_relationship_list scope
>> >> > -------------------------------------
>> >> >
>> >> > You're correct -- dml_adaptive_global does not use
>> >> > dml_adaptive_object_relationship_list. Changed
>> >> > check_object_relationship_list() to check for DLBOW_DML_ADAPTIVE
>> >> > only, not DLBOW_IS_DML_ADAPTIVE (which includes global).
>> >> >
>> >> > Documentation fixes
>> >> > -------------------
>> >> >
>> >> > - Removed "(Lagless Replica Reads)" from section title and
>> >> > "lagless" language from description.
>> >> >
>> >> > - Described fallback behavior when neither
>> >> > replication_delay_source_cmd nor delay_threshold_by_time is
>> >> > configured (TTL stays at 100ms default minimum).
>> >> >
>> >> > - "query cache" references removed (the query parse cache is gone).
>> >> >
>> >> > - Added 128-table-per-SELECT limit to Limitations section
>> >> > (uses POOL_MAX_SELECT_OIDS).
>> >> >
>> >> > Code style fixes
>> >> > ----------------
>> >> >
>> >> > - DLBOW_IS_DML_ADAPTIVE() calls no longer split across lines.
>> >> >
>> >> > - Split the long errmsg line in
>> >> > is_select_object_in_temp_write_list.
>> >> >
>> >> > - Removed redundant is_adaptive variable in
>> >> > is_select_object_in_temp_write_list (the check at function
>> >> > entry already guarantees it).
>> >> >
>> >> > Thanks!
>> >> >
>> >> > On Wed, Apr 15, 2026 at 1:43 AM Tatsuo Ishii <ishii(at)postgresql(dot)org>
>> >> wrote:
>> >> >
>> >> >> Hi Nadav,
>> >> >>
>> >> >> > Hi Tatsuo,
>> >> >> >
>> >> >> > Looks good to me thanks!
>> >> >> >
>> >> >> > Please go ahead with your review. waiting to hear back from you.
>> >> >>
>> >> >> Here are the code review results.
>> >> >>
>> >> >> diff --git a/doc/src/sgml/loadbalance.sgml
>> >> b/doc/src/sgml/loadbalance.sgml
>> >> >> index 9e1e7b39b..7384ce81a 100644
>> >> >> --- a/doc/src/sgml/loadbalance.sgml
>> >> >> +++ b/doc/src/sgml/loadbalance.sgml
>> >> >> :
>> >> >> + <sect2 id="runtime-config-table-mutation-map">
>> >> >> + <title>Table Mutation Map Configuration (Lagless Replica
>> >> Reads)</title>
>> >> >>
>> >> >> "(Lagless Replica Reads)" sounds like an advertisement to me. It
>> >> >> should be removed.
>> >> >>
>> >> >> + <para>
>> >> >> + These parameters configure the track table mutation feature,
>> which
>> >> is
>> >> >> activated by setting
>> >> >> + <xref linkend="guc-disable-load-balance-on-write"> to
>> >> >> <literal>dml_adaptive_global</literal>.
>> >> >> + The feature tracks recently written tables to prevent stale
>> reads
>> >> from
>> >> >> replica nodes during
>> >> >> + replication lag, implementing the "lagless" architecture
>> pattern for
>> >> >> distributed systems
>> >> >> + with read replicas.
>> >> >>
>> >> >> I think the feature does not guarantee "lagless" anytime, in all
>> cases.
>> >> >>
>> >> >> + <para>
>> >> >> + This feature requires time-based replication delay monitoring.
>> This
>> >> >> can be provided by either
>> >> >> + <xref linkend="guc-replication-delay-source-cmd"> (external
>> command
>> >> >> mode) or by setting
>> >> >> + <xref linkend="guc-delay-threshold-by-time"> (which uses
>> >> >> <literal>pg_stat_replication.replay_lag</literal>
>> >> >> + from PostgreSQL 10+). At least one of these must be configured
>> for
>> >> the
>> >> >> TTL calculation to work.
>> >> >>
>> >> >> If one of these is not set, what happens? Error? Need to describe
>> it.
>> >> >>
>> >> >> + </para>
>> >> >> +
>> >> >> + <warning>
>> >> >> + <para>
>> >> >> + Enabling <literal>dml_adaptive_global</literal> increases
>> shared
>> >> >> memory consumption. With default settings,
>> >> >> + the feature requires approximately 6.4 MB of shared memory
>> (0.1 MB
>> >> >> for table tracking + 6.3 MB for query cache).
>> >> >>
>> >> >> "query cache" should be "query parse cache".
>> >> >>
>> >> >> + Memory usage scales with configuration parameters:
>> >> >> + </para>
>> >> >> + <itemizedlist>
>> >> >> + <listitem>
>> >> >> + <para>
>> >> >> + Table tracking: <literal>track_table_mutation_table_size * 40
>> >> >> bytes</literal> (default: 2048 * 40 = ~80 KB)
>> >> >> + </para>
>> >> >> + </listitem>
>> >> >> + <listitem>
>> >> >> + <para>
>> >> >> + Query cache:
>> >> <literal>track_table_mutation_query_parse_cache_size *
>> >> >> 640 bytes</literal> (default: 10000 * 640 = ~6.3 MB)
>> >> >>
>> >> >> "query cache" should be "query parse cache".
>> >> >>
>> >> >> + <title>Limitations</title>
>> >> >>
>> >> >> I think number of tables tacked in a SELECT is limited to 8. It
>> should
>> >> >> be mentioned.
>> >> >>
>> >> >> diff --git a/src/context/pool_query_context.c
>> >> >> b/src/context/pool_query_context.c
>> >> >> index a056ac596..0190d3673 100644
>> >> >> --- a/src/context/pool_query_context.c
>> >> >> +++ b/src/context/pool_query_context.c
>> >> >> @@ -1828,15 +1829,23 @@ is_in_list(char *name, List *list)
>> >> >> static bool
>> >> >> is_select_object_in_temp_write_list(Node *node, void *context)
>> >> >> {
>> >> >> - if (node == NULL ||
>> pool_config->disable_load_balance_on_write
>> >> !=
>> >> >> DLBOW_DML_ADAPTIVE)
>> >> >> + if (node == NULL ||
>> >> >> + !DLBOW_IS_DML_ADAPTIVE(
>> >> >> +
>> >> >> pool_config->disable_load_balance_on_write))
>> >> >>
>> >> >> You don't need to split the line.
>> >> >>
>> >> >> + is_adaptive = DLBOW_IS_DML_ADAPTIVE(
>> >> >> +
>> >> >> pool_config->disable_load_balance_on_write);
>> >> >>
>> >> >> You don't need to split the line.
>> >> >>
>> >> >> - if (pool_config->disable_load_balance_on_write ==
>> >> >> DLBOW_DML_ADAPTIVE && session_context->is_in_transaction)
>> >> >> + if (is_adaptive &&
>> >> >> + session_context->is_in_transaction)
>> >> >> {
>> >> >> ereport(DEBUG1,
>> >> >>
>> >> >> (errmsg("is_select_object_in_temp_write_list: \"%s\", found relation
>> >> >> \"%s\"", (char *) context, rgv->relname)));
>> >> >> This line is too long. Please split.
>> >> >>
>> >> >> @@ -1880,7 +1889,13 @@ static char
>> >> >> *get_associated_object_from_dml_adaptive_relations
>> >> >> void
>> >> >> check_object_relationship_list(char *name, bool is_func_name)
>> >> >> {
>> >> >> - if (pool_config->disable_load_balance_on_write ==
>> >> >> DLBOW_DML_ADAPTIVE &&
>> >> >> pool_config->parsed_dml_adaptive_object_relationship_list)
>> >> >> + bool is_adaptive;
>> >> >> +
>> >> >> + is_adaptive = DLBOW_IS_DML_ADAPTIVE(
>> >> >> +
>> >> >> pool_config->disable_load_balance_on_write);
>> >> >>
>> >> >> I wrote in the commit message:
>> >> >>
>> >> >> modifications are only detected in the same transaction). Note,
>> >> >> however, you cannot use dml_adaptive_object_relationship_list to
>> track
>> >> >> dependency among table and other objects.
>> >> >>
>> >> >> In my understanding the feature does not use
>> >> >> dml_adaptive_object_relationship_list. If this is correct, why
>> >> >> check_object_relationship_list() is called here in case
>> >> >> dml_adaptive_global? If the feature uses
>> >> >> dml_adaptive_object_relationship_list, test cases should be
>> included.
>> >> >>
>> >> >> diff --git a/src/utils/pool_track_table_mutation.c
>> >> >> b/src/utils/pool_track_table_mutation.c
>> >> >> new file mode 100644
>> >> >> index 000000000..9be46b28f
>> >> >> --- /dev/null
>> >> >> +++ b/src/utils/pool_track_table_mutation.c
>> >> >>
>> >> >> It seems following functions are not used anywhere. I wonder if this
>> >> >> feature actually use "query parse cache".
>> >> >>
>> >> >> pool_track_table_mutation_get_cached_parse
>> >> >> pool_track_table_mutation_cache_parse
>> >> >> pool_track_table_mutation_normalize_and_hash
>> >> >>
>> >> >> Besides the code review, I mutated one of regression tests to check
>> >> >> whether the feature co exists with in the existing memory query
>> cache
>> >> >> feature. After attached patch applied, I ran 006.memqcache and got
>> the
>> >> >> following result.
>> >> >>
>> >> >> cd src/test/regression
>> >> >> ./regress.sh 006
>> >> >> creating pgpool-II temporary installation ...
>> >> >> moving pgpool_setup to temporary installation path ...
>> >> >> moving watchdog_setup to temporary installation path ...
>> >> >> using pgpool-II at
>> >> >>
>> >>
>> /home/t-ishii/work/Pgpool-II/current/pgpool2/src/test/regression/temp/installed
>> >> >> *************************
>> >> >> REGRESSION MODE : install
>> >> >> Pgpool-II version : pgpool-II version 4.8devel
>> (mitsukakeboshi)
>> >> >> Pgpool-II install path :
>> >> >>
>> >>
>> /home/t-ishii/work/Pgpool-II/current/pgpool2/src/test/regression/temp/installed
>> >> >> PostgreSQL bin : /usr/local/pgsql/bin
>> >> >> PostgreSQL Major version : 18
>> >> >> pgbench : /usr/local/pgsql/bin/pgbench
>> >> >> PostgreSQL jdbc :
>> >> >> /usr/local/pgsql/share/postgresql-9.2-1003.jdbc4.jar
>> >> >> *************************
>> >> >> testing 006.memqcache...failed.
>> >> >> out of 1 ok:0 failed:1 timeout:0
>> >> >>
>> >> >> log/006.memqcache shows:
>> >> >>
>> >> >> ../expected.txt result.txt differ: char 1, line 1
>> >> >>
>> >> >> So I checked the test script and found the error was generated by a
>> >> >> Java program test.
>> >> >>
>> >> >> java jdbctest > result.txt 2>&1
>> >> >> cmp ../expected.txt result.txt
>> >> >> if [ $? != 0 ];then
>> >> >> ./shutdownall
>> >> >> exit 1
>> >> >> fi
>> >> >>
>> >> >> In jdbctest.java:
>> >> >>
>> >> >> /*
>> >> >> * Cache test in an explicit transaction
>> >> >> */
>> >> >> conn.setAutoCommit(false);
>> >> >> // execute DML. This should prevent SELECTs from
>> using
>> >> >> query cache in the transaction.
>> >> >> sql = "UPDATE t1 SET i = 2;";
>> >> >> pst = conn.createStatement();
>> >> >> pst.executeUpdate(sql);
>> >> >> pst.close();
>> >> >> // should not use the cache and should return "2",
>> >> rather
>> >> >> than "1"
>> >> >> prest = conn.prepareStatement("SELECT * FROM t1");
>> >> >> rs = prest.executeQuery();
>> >> >>
>> >> >> The expected file (expected.txt) has "2" but the result file
>> >> >> (testdir/result.txt) was "1". This is the reason why the test
>> >> >> failed. I wonder if there's something wrong with the feature when
>> the
>> >> >> query cache is enabled. Can you look into this?
>> >> >>
>> >> >> Regards,
>> >> >> --
>> >> >> Tatsuo Ishii
>> >> >> SRA OSS K.K.
>> >> >> English: http://www.sraoss.co.jp/index_en/
>> >> >> Japanese:http://www.sraoss.co.jp
>> >> >>
>> >> >
>> >> >
>> >> > --
>> >> > Nadav Shatz
>> >> > Tailor Brands | CTO
>> >>
>> >
>> >
>> > --
>> > Nadav Shatz
>> > Tailor Brands | CTO
>>
>
>
> --
> Nadav Shatz
> Tailor Brands | CTO
>
--
Nadav Shatz
Tailor Brands | CTO
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tatsuo Ishii | 2026-05-18 10:11:05 | Re: Proposal: Recent mutated table tracking in memory |
| Previous Message | Koshino Taiki | 2026-05-13 02:46:14 | Notification of updated Pgpool-II download URLs |