Re: Proposal: Recent mutated table tracking in memory

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: nadav(at)tailorbrands(dot)com
Cc: pgpool-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Proposal: Recent mutated table tracking in memory
Date: 2026-04-23 08:14:08
Message-ID: 20260423.171408.1934918588656421138.ishii@postgresql.org
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgpool-hackers

Hi Nadav,

Unfortunately the mutated 006.memqcache failed (timeout).

>> > memqcache bug fix
>> > -----------------
>> >
>> > Good catch. The root cause: pool_set_writing_transaction() was
>> > explicitly skipping dml_adaptive_global, so
>> > pool_is_writing_transaction() always returned false in this mode.
>> > The query cache fetch guard at pool_proto_modules.c:270
>> > (!pool_is_writing_transaction()) then served stale cached results
>> > after DML in the same transaction.
>> >
>> > Fix: pool_set_writing_transaction() now sets the flag for
>> > dml_adaptive_global (only 'off' and 'dml_adaptive' skip it). This
>> > ensures the query cache is properly bypassed after writes within
>> > the same transaction.

Regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

> Hi Tatsuo,
>
> Rebased onto current master, renumbered the regression tests
> (043/044 to avoid collision with the new 042.ssl_reload), and
> combined everything into a single commit.
>
> Attached: v2-0001-Feature-load-balancing-control-by-table-tracking.patch
>
> Looking forward to your review.
>
>
> On Sun, Apr 19, 2026 at 10:25 AM Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>
>> > Hi Tatsuo,
>> >
>> > hank you for the detailed review. Attached patch addresses all items.
>>
>> I guess the attached patch is on top of
>> v1-0001-Feature-load-balancing-control-by-table-tracking.patch. To
>> apply v2-0001-address-review.patch, we need to apply
>> v1-0001-Feature-load-balancing-control-by-table-tracking.patch first.
>> Unfortunately due to recent commit, it does not apply anymore. Can you
>> please provide v1 + v2 that are rebased against latest master branch?
>> Also 042 regression test is already used by recent commit. Can you
>> renumber 042.track_table_mutation and
>> 043.track_table_mutation_watchdog to 043.track_table_mutation and
>> 044.track_table_mutation_watchdog accordingly?
>>
>> Looking forward to seeing new patch.
>>
>> Regards,
>> --
>> Tatsuo Ishii
>> SRA OSS K.K.
>> English: http://www.sraoss.co.jp/index_en/
>> Japanese:http://www.sraoss.co.jp
>>
>>
>> > memqcache bug fix
>> > -----------------
>> >
>> > Good catch. The root cause: pool_set_writing_transaction() was
>> > explicitly skipping dml_adaptive_global, so
>> > pool_is_writing_transaction() always returned false in this mode.
>> > The query cache fetch guard at pool_proto_modules.c:270
>> > (!pool_is_writing_transaction()) then served stale cached results
>> > after DML in the same transaction.
>> >
>> > Fix: pool_set_writing_transaction() now sets the flag for
>> > dml_adaptive_global (only 'off' and 'dml_adaptive' skip it). This
>> > ensures the query cache is properly bypassed after writes within
>> > the same transaction.
>> >
>> > Removed dead query parse cache code (~700 lines)
>> > -------------------------------------------------
>> >
>> > You're right -- pool_track_table_mutation_get_cached_parse,
>> > pool_track_table_mutation_cache_parse, and
>> > pool_track_table_mutation_normalize_and_hash were never called.
>> > These were leftover from an earlier design where we planned to
>> > cache SQL parse results in shared memory. The feature ended up
>> > using pgpool's existing parser directly, and this code was never
>> > wired up.
>> >
>> > Removed: QueryParseCache and QueryParseEntry structs, all related
>> > static functions, the TRACK_TABLE_MUTATION_QUERY_SEM semaphore,
>> > and the track_table_mutation_query_buckets /
>> > track_table_mutation_query_parse_cache_size configuration
>> > parameters. This also reduces shared memory usage from ~6.4 MB
>> > to ~80 KB with default settings.
>> >
>> > check_object_relationship_list scope
>> > -------------------------------------
>> >
>> > You're correct -- dml_adaptive_global does not use
>> > dml_adaptive_object_relationship_list. Changed
>> > check_object_relationship_list() to check for DLBOW_DML_ADAPTIVE
>> > only, not DLBOW_IS_DML_ADAPTIVE (which includes global).
>> >
>> > Documentation fixes
>> > -------------------
>> >
>> > - Removed "(Lagless Replica Reads)" from section title and
>> > "lagless" language from description.
>> >
>> > - Described fallback behavior when neither
>> > replication_delay_source_cmd nor delay_threshold_by_time is
>> > configured (TTL stays at 100ms default minimum).
>> >
>> > - "query cache" references removed (the query parse cache is gone).
>> >
>> > - Added 128-table-per-SELECT limit to Limitations section
>> > (uses POOL_MAX_SELECT_OIDS).
>> >
>> > Code style fixes
>> > ----------------
>> >
>> > - DLBOW_IS_DML_ADAPTIVE() calls no longer split across lines.
>> >
>> > - Split the long errmsg line in
>> > is_select_object_in_temp_write_list.
>> >
>> > - Removed redundant is_adaptive variable in
>> > is_select_object_in_temp_write_list (the check at function
>> > entry already guarantees it).
>> >
>> > Thanks!
>> >
>> > On Wed, Apr 15, 2026 at 1:43 AM Tatsuo Ishii <ishii(at)postgresql(dot)org>
>> wrote:
>> >
>> >> Hi Nadav,
>> >>
>> >> > Hi Tatsuo,
>> >> >
>> >> > Looks good to me thanks!
>> >> >
>> >> > Please go ahead with your review. waiting to hear back from you.
>> >>
>> >> Here are the code review results.
>> >>
>> >> diff --git a/doc/src/sgml/loadbalance.sgml
>> b/doc/src/sgml/loadbalance.sgml
>> >> index 9e1e7b39b..7384ce81a 100644
>> >> --- a/doc/src/sgml/loadbalance.sgml
>> >> +++ b/doc/src/sgml/loadbalance.sgml
>> >> :
>> >> + <sect2 id="runtime-config-table-mutation-map">
>> >> + <title>Table Mutation Map Configuration (Lagless Replica
>> Reads)</title>
>> >>
>> >> "(Lagless Replica Reads)" sounds like an advertisement to me. It
>> >> should be removed.
>> >>
>> >> + <para>
>> >> + These parameters configure the track table mutation feature, which
>> is
>> >> activated by setting
>> >> + <xref linkend="guc-disable-load-balance-on-write"> to
>> >> <literal>dml_adaptive_global</literal>.
>> >> + The feature tracks recently written tables to prevent stale reads
>> from
>> >> replica nodes during
>> >> + replication lag, implementing the "lagless" architecture pattern for
>> >> distributed systems
>> >> + with read replicas.
>> >>
>> >> I think the feature does not guarantee "lagless" anytime, in all cases.
>> >>
>> >> + <para>
>> >> + This feature requires time-based replication delay monitoring. This
>> >> can be provided by either
>> >> + <xref linkend="guc-replication-delay-source-cmd"> (external command
>> >> mode) or by setting
>> >> + <xref linkend="guc-delay-threshold-by-time"> (which uses
>> >> <literal>pg_stat_replication.replay_lag</literal>
>> >> + from PostgreSQL 10+). At least one of these must be configured for
>> the
>> >> TTL calculation to work.
>> >>
>> >> If one of these is not set, what happens? Error? Need to describe it.
>> >>
>> >> + </para>
>> >> +
>> >> + <warning>
>> >> + <para>
>> >> + Enabling <literal>dml_adaptive_global</literal> increases shared
>> >> memory consumption. With default settings,
>> >> + the feature requires approximately 6.4 MB of shared memory (0.1 MB
>> >> for table tracking + 6.3 MB for query cache).
>> >>
>> >> "query cache" should be "query parse cache".
>> >>
>> >> + Memory usage scales with configuration parameters:
>> >> + </para>
>> >> + <itemizedlist>
>> >> + <listitem>
>> >> + <para>
>> >> + Table tracking: <literal>track_table_mutation_table_size * 40
>> >> bytes</literal> (default: 2048 * 40 = ~80 KB)
>> >> + </para>
>> >> + </listitem>
>> >> + <listitem>
>> >> + <para>
>> >> + Query cache:
>> <literal>track_table_mutation_query_parse_cache_size *
>> >> 640 bytes</literal> (default: 10000 * 640 = ~6.3 MB)
>> >>
>> >> "query cache" should be "query parse cache".
>> >>
>> >> + <title>Limitations</title>
>> >>
>> >> I think number of tables tacked in a SELECT is limited to 8. It should
>> >> be mentioned.
>> >>
>> >> diff --git a/src/context/pool_query_context.c
>> >> b/src/context/pool_query_context.c
>> >> index a056ac596..0190d3673 100644
>> >> --- a/src/context/pool_query_context.c
>> >> +++ b/src/context/pool_query_context.c
>> >> @@ -1828,15 +1829,23 @@ is_in_list(char *name, List *list)
>> >> static bool
>> >> is_select_object_in_temp_write_list(Node *node, void *context)
>> >> {
>> >> - if (node == NULL || pool_config->disable_load_balance_on_write
>> !=
>> >> DLBOW_DML_ADAPTIVE)
>> >> + if (node == NULL ||
>> >> + !DLBOW_IS_DML_ADAPTIVE(
>> >> +
>> >> pool_config->disable_load_balance_on_write))
>> >>
>> >> You don't need to split the line.
>> >>
>> >> + is_adaptive = DLBOW_IS_DML_ADAPTIVE(
>> >> +
>> >> pool_config->disable_load_balance_on_write);
>> >>
>> >> You don't need to split the line.
>> >>
>> >> - if (pool_config->disable_load_balance_on_write ==
>> >> DLBOW_DML_ADAPTIVE && session_context->is_in_transaction)
>> >> + if (is_adaptive &&
>> >> + session_context->is_in_transaction)
>> >> {
>> >> ereport(DEBUG1,
>> >>
>> >> (errmsg("is_select_object_in_temp_write_list: \"%s\", found relation
>> >> \"%s\"", (char *) context, rgv->relname)));
>> >> This line is too long. Please split.
>> >>
>> >> @@ -1880,7 +1889,13 @@ static char
>> >> *get_associated_object_from_dml_adaptive_relations
>> >> void
>> >> check_object_relationship_list(char *name, bool is_func_name)
>> >> {
>> >> - if (pool_config->disable_load_balance_on_write ==
>> >> DLBOW_DML_ADAPTIVE &&
>> >> pool_config->parsed_dml_adaptive_object_relationship_list)
>> >> + bool is_adaptive;
>> >> +
>> >> + is_adaptive = DLBOW_IS_DML_ADAPTIVE(
>> >> +
>> >> pool_config->disable_load_balance_on_write);
>> >>
>> >> I wrote in the commit message:
>> >>
>> >> modifications are only detected in the same transaction). Note,
>> >> however, you cannot use dml_adaptive_object_relationship_list to track
>> >> dependency among table and other objects.
>> >>
>> >> In my understanding the feature does not use
>> >> dml_adaptive_object_relationship_list. If this is correct, why
>> >> check_object_relationship_list() is called here in case
>> >> dml_adaptive_global? If the feature uses
>> >> dml_adaptive_object_relationship_list, test cases should be included.
>> >>
>> >> diff --git a/src/utils/pool_track_table_mutation.c
>> >> b/src/utils/pool_track_table_mutation.c
>> >> new file mode 100644
>> >> index 000000000..9be46b28f
>> >> --- /dev/null
>> >> +++ b/src/utils/pool_track_table_mutation.c
>> >>
>> >> It seems following functions are not used anywhere. I wonder if this
>> >> feature actually use "query parse cache".
>> >>
>> >> pool_track_table_mutation_get_cached_parse
>> >> pool_track_table_mutation_cache_parse
>> >> pool_track_table_mutation_normalize_and_hash
>> >>
>> >> Besides the code review, I mutated one of regression tests to check
>> >> whether the feature co exists with in the existing memory query cache
>> >> feature. After attached patch applied, I ran 006.memqcache and got the
>> >> following result.
>> >>
>> >> cd src/test/regression
>> >> ./regress.sh 006
>> >> creating pgpool-II temporary installation ...
>> >> moving pgpool_setup to temporary installation path ...
>> >> moving watchdog_setup to temporary installation path ...
>> >> using pgpool-II at
>> >>
>> /home/t-ishii/work/Pgpool-II/current/pgpool2/src/test/regression/temp/installed
>> >> *************************
>> >> REGRESSION MODE : install
>> >> Pgpool-II version : pgpool-II version 4.8devel (mitsukakeboshi)
>> >> Pgpool-II install path :
>> >>
>> /home/t-ishii/work/Pgpool-II/current/pgpool2/src/test/regression/temp/installed
>> >> PostgreSQL bin : /usr/local/pgsql/bin
>> >> PostgreSQL Major version : 18
>> >> pgbench : /usr/local/pgsql/bin/pgbench
>> >> PostgreSQL jdbc :
>> >> /usr/local/pgsql/share/postgresql-9.2-1003.jdbc4.jar
>> >> *************************
>> >> testing 006.memqcache...failed.
>> >> out of 1 ok:0 failed:1 timeout:0
>> >>
>> >> log/006.memqcache shows:
>> >>
>> >> ../expected.txt result.txt differ: char 1, line 1
>> >>
>> >> So I checked the test script and found the error was generated by a
>> >> Java program test.
>> >>
>> >> java jdbctest > result.txt 2>&1
>> >> cmp ../expected.txt result.txt
>> >> if [ $? != 0 ];then
>> >> ./shutdownall
>> >> exit 1
>> >> fi
>> >>
>> >> In jdbctest.java:
>> >>
>> >> /*
>> >> * Cache test in an explicit transaction
>> >> */
>> >> conn.setAutoCommit(false);
>> >> // execute DML. This should prevent SELECTs from using
>> >> query cache in the transaction.
>> >> sql = "UPDATE t1 SET i = 2;";
>> >> pst = conn.createStatement();
>> >> pst.executeUpdate(sql);
>> >> pst.close();
>> >> // should not use the cache and should return "2",
>> rather
>> >> than "1"
>> >> prest = conn.prepareStatement("SELECT * FROM t1");
>> >> rs = prest.executeQuery();
>> >>
>> >> The expected file (expected.txt) has "2" but the result file
>> >> (testdir/result.txt) was "1". This is the reason why the test
>> >> failed. I wonder if there's something wrong with the feature when the
>> >> query cache is enabled. Can you look into this?
>> >>
>> >> Regards,
>> >> --
>> >> Tatsuo Ishii
>> >> SRA OSS K.K.
>> >> English: http://www.sraoss.co.jp/index_en/
>> >> Japanese:http://www.sraoss.co.jp
>> >>
>> >
>> >
>> > --
>> > Nadav Shatz
>> > Tailor Brands | CTO
>>
>
>
> --
> Nadav Shatz
> Tailor Brands | CTO

In response to

Responses

Browse pgpool-hackers by date

  From Date Subject
Next Message Nadav Shatz 2026-04-23 14:16:24 Re: Proposal: Recent mutated table tracking in memory
Previous Message Nadav Shatz 2026-04-19 14:29:42 Re: Proposal: Recent mutated table tracking in memory