| From: | Nadav Shatz <nadav(at)tailorbrands(dot)com> |
|---|---|
| To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
| Cc: | pgpool-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Proposal: Recent mutated table tracking in memory |
| Date: | 2026-04-19 14:29:42 |
| Message-ID: | CACeKOO1-FUo1E=1iUeQeua7XrVzYVoy2FawQM4d3SG-X5f2mdQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgpool-hackers |
Hi Tatsuo,
Rebased onto current master, renumbered the regression tests
(043/044 to avoid collision with the new 042.ssl_reload), and
combined everything into a single commit.
Attached: v2-0001-Feature-load-balancing-control-by-table-tracking.patch
Looking forward to your review.
On Sun, Apr 19, 2026 at 10:25 AM Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
> > Hi Tatsuo,
> >
> > hank you for the detailed review. Attached patch addresses all items.
>
> I guess the attached patch is on top of
> v1-0001-Feature-load-balancing-control-by-table-tracking.patch. To
> apply v2-0001-address-review.patch, we need to apply
> v1-0001-Feature-load-balancing-control-by-table-tracking.patch first.
> Unfortunately due to recent commit, it does not apply anymore. Can you
> please provide v1 + v2 that are rebased against latest master branch?
> Also 042 regression test is already used by recent commit. Can you
> renumber 042.track_table_mutation and
> 043.track_table_mutation_watchdog to 043.track_table_mutation and
> 044.track_table_mutation_watchdog accordingly?
>
> Looking forward to seeing new patch.
>
> Regards,
> --
> Tatsuo Ishii
> SRA OSS K.K.
> English: http://www.sraoss.co.jp/index_en/
> Japanese:http://www.sraoss.co.jp
>
>
> > memqcache bug fix
> > -----------------
> >
> > Good catch. The root cause: pool_set_writing_transaction() was
> > explicitly skipping dml_adaptive_global, so
> > pool_is_writing_transaction() always returned false in this mode.
> > The query cache fetch guard at pool_proto_modules.c:270
> > (!pool_is_writing_transaction()) then served stale cached results
> > after DML in the same transaction.
> >
> > Fix: pool_set_writing_transaction() now sets the flag for
> > dml_adaptive_global (only 'off' and 'dml_adaptive' skip it). This
> > ensures the query cache is properly bypassed after writes within
> > the same transaction.
> >
> > Removed dead query parse cache code (~700 lines)
> > -------------------------------------------------
> >
> > You're right -- pool_track_table_mutation_get_cached_parse,
> > pool_track_table_mutation_cache_parse, and
> > pool_track_table_mutation_normalize_and_hash were never called.
> > These were leftover from an earlier design where we planned to
> > cache SQL parse results in shared memory. The feature ended up
> > using pgpool's existing parser directly, and this code was never
> > wired up.
> >
> > Removed: QueryParseCache and QueryParseEntry structs, all related
> > static functions, the TRACK_TABLE_MUTATION_QUERY_SEM semaphore,
> > and the track_table_mutation_query_buckets /
> > track_table_mutation_query_parse_cache_size configuration
> > parameters. This also reduces shared memory usage from ~6.4 MB
> > to ~80 KB with default settings.
> >
> > check_object_relationship_list scope
> > -------------------------------------
> >
> > You're correct -- dml_adaptive_global does not use
> > dml_adaptive_object_relationship_list. Changed
> > check_object_relationship_list() to check for DLBOW_DML_ADAPTIVE
> > only, not DLBOW_IS_DML_ADAPTIVE (which includes global).
> >
> > Documentation fixes
> > -------------------
> >
> > - Removed "(Lagless Replica Reads)" from section title and
> > "lagless" language from description.
> >
> > - Described fallback behavior when neither
> > replication_delay_source_cmd nor delay_threshold_by_time is
> > configured (TTL stays at 100ms default minimum).
> >
> > - "query cache" references removed (the query parse cache is gone).
> >
> > - Added 128-table-per-SELECT limit to Limitations section
> > (uses POOL_MAX_SELECT_OIDS).
> >
> > Code style fixes
> > ----------------
> >
> > - DLBOW_IS_DML_ADAPTIVE() calls no longer split across lines.
> >
> > - Split the long errmsg line in
> > is_select_object_in_temp_write_list.
> >
> > - Removed redundant is_adaptive variable in
> > is_select_object_in_temp_write_list (the check at function
> > entry already guarantees it).
> >
> > Thanks!
> >
> > On Wed, Apr 15, 2026 at 1:43 AM Tatsuo Ishii <ishii(at)postgresql(dot)org>
> wrote:
> >
> >> Hi Nadav,
> >>
> >> > Hi Tatsuo,
> >> >
> >> > Looks good to me thanks!
> >> >
> >> > Please go ahead with your review. waiting to hear back from you.
> >>
> >> Here are the code review results.
> >>
> >> diff --git a/doc/src/sgml/loadbalance.sgml
> b/doc/src/sgml/loadbalance.sgml
> >> index 9e1e7b39b..7384ce81a 100644
> >> --- a/doc/src/sgml/loadbalance.sgml
> >> +++ b/doc/src/sgml/loadbalance.sgml
> >> :
> >> + <sect2 id="runtime-config-table-mutation-map">
> >> + <title>Table Mutation Map Configuration (Lagless Replica
> Reads)</title>
> >>
> >> "(Lagless Replica Reads)" sounds like an advertisement to me. It
> >> should be removed.
> >>
> >> + <para>
> >> + These parameters configure the track table mutation feature, which
> is
> >> activated by setting
> >> + <xref linkend="guc-disable-load-balance-on-write"> to
> >> <literal>dml_adaptive_global</literal>.
> >> + The feature tracks recently written tables to prevent stale reads
> from
> >> replica nodes during
> >> + replication lag, implementing the "lagless" architecture pattern for
> >> distributed systems
> >> + with read replicas.
> >>
> >> I think the feature does not guarantee "lagless" anytime, in all cases.
> >>
> >> + <para>
> >> + This feature requires time-based replication delay monitoring. This
> >> can be provided by either
> >> + <xref linkend="guc-replication-delay-source-cmd"> (external command
> >> mode) or by setting
> >> + <xref linkend="guc-delay-threshold-by-time"> (which uses
> >> <literal>pg_stat_replication.replay_lag</literal>
> >> + from PostgreSQL 10+). At least one of these must be configured for
> the
> >> TTL calculation to work.
> >>
> >> If one of these is not set, what happens? Error? Need to describe it.
> >>
> >> + </para>
> >> +
> >> + <warning>
> >> + <para>
> >> + Enabling <literal>dml_adaptive_global</literal> increases shared
> >> memory consumption. With default settings,
> >> + the feature requires approximately 6.4 MB of shared memory (0.1 MB
> >> for table tracking + 6.3 MB for query cache).
> >>
> >> "query cache" should be "query parse cache".
> >>
> >> + Memory usage scales with configuration parameters:
> >> + </para>
> >> + <itemizedlist>
> >> + <listitem>
> >> + <para>
> >> + Table tracking: <literal>track_table_mutation_table_size * 40
> >> bytes</literal> (default: 2048 * 40 = ~80 KB)
> >> + </para>
> >> + </listitem>
> >> + <listitem>
> >> + <para>
> >> + Query cache:
> <literal>track_table_mutation_query_parse_cache_size *
> >> 640 bytes</literal> (default: 10000 * 640 = ~6.3 MB)
> >>
> >> "query cache" should be "query parse cache".
> >>
> >> + <title>Limitations</title>
> >>
> >> I think number of tables tacked in a SELECT is limited to 8. It should
> >> be mentioned.
> >>
> >> diff --git a/src/context/pool_query_context.c
> >> b/src/context/pool_query_context.c
> >> index a056ac596..0190d3673 100644
> >> --- a/src/context/pool_query_context.c
> >> +++ b/src/context/pool_query_context.c
> >> @@ -1828,15 +1829,23 @@ is_in_list(char *name, List *list)
> >> static bool
> >> is_select_object_in_temp_write_list(Node *node, void *context)
> >> {
> >> - if (node == NULL || pool_config->disable_load_balance_on_write
> !=
> >> DLBOW_DML_ADAPTIVE)
> >> + if (node == NULL ||
> >> + !DLBOW_IS_DML_ADAPTIVE(
> >> +
> >> pool_config->disable_load_balance_on_write))
> >>
> >> You don't need to split the line.
> >>
> >> + is_adaptive = DLBOW_IS_DML_ADAPTIVE(
> >> +
> >> pool_config->disable_load_balance_on_write);
> >>
> >> You don't need to split the line.
> >>
> >> - if (pool_config->disable_load_balance_on_write ==
> >> DLBOW_DML_ADAPTIVE && session_context->is_in_transaction)
> >> + if (is_adaptive &&
> >> + session_context->is_in_transaction)
> >> {
> >> ereport(DEBUG1,
> >>
> >> (errmsg("is_select_object_in_temp_write_list: \"%s\", found relation
> >> \"%s\"", (char *) context, rgv->relname)));
> >> This line is too long. Please split.
> >>
> >> @@ -1880,7 +1889,13 @@ static char
> >> *get_associated_object_from_dml_adaptive_relations
> >> void
> >> check_object_relationship_list(char *name, bool is_func_name)
> >> {
> >> - if (pool_config->disable_load_balance_on_write ==
> >> DLBOW_DML_ADAPTIVE &&
> >> pool_config->parsed_dml_adaptive_object_relationship_list)
> >> + bool is_adaptive;
> >> +
> >> + is_adaptive = DLBOW_IS_DML_ADAPTIVE(
> >> +
> >> pool_config->disable_load_balance_on_write);
> >>
> >> I wrote in the commit message:
> >>
> >> modifications are only detected in the same transaction). Note,
> >> however, you cannot use dml_adaptive_object_relationship_list to track
> >> dependency among table and other objects.
> >>
> >> In my understanding the feature does not use
> >> dml_adaptive_object_relationship_list. If this is correct, why
> >> check_object_relationship_list() is called here in case
> >> dml_adaptive_global? If the feature uses
> >> dml_adaptive_object_relationship_list, test cases should be included.
> >>
> >> diff --git a/src/utils/pool_track_table_mutation.c
> >> b/src/utils/pool_track_table_mutation.c
> >> new file mode 100644
> >> index 000000000..9be46b28f
> >> --- /dev/null
> >> +++ b/src/utils/pool_track_table_mutation.c
> >>
> >> It seems following functions are not used anywhere. I wonder if this
> >> feature actually use "query parse cache".
> >>
> >> pool_track_table_mutation_get_cached_parse
> >> pool_track_table_mutation_cache_parse
> >> pool_track_table_mutation_normalize_and_hash
> >>
> >> Besides the code review, I mutated one of regression tests to check
> >> whether the feature co exists with in the existing memory query cache
> >> feature. After attached patch applied, I ran 006.memqcache and got the
> >> following result.
> >>
> >> cd src/test/regression
> >> ./regress.sh 006
> >> creating pgpool-II temporary installation ...
> >> moving pgpool_setup to temporary installation path ...
> >> moving watchdog_setup to temporary installation path ...
> >> using pgpool-II at
> >>
> /home/t-ishii/work/Pgpool-II/current/pgpool2/src/test/regression/temp/installed
> >> *************************
> >> REGRESSION MODE : install
> >> Pgpool-II version : pgpool-II version 4.8devel (mitsukakeboshi)
> >> Pgpool-II install path :
> >>
> /home/t-ishii/work/Pgpool-II/current/pgpool2/src/test/regression/temp/installed
> >> PostgreSQL bin : /usr/local/pgsql/bin
> >> PostgreSQL Major version : 18
> >> pgbench : /usr/local/pgsql/bin/pgbench
> >> PostgreSQL jdbc :
> >> /usr/local/pgsql/share/postgresql-9.2-1003.jdbc4.jar
> >> *************************
> >> testing 006.memqcache...failed.
> >> out of 1 ok:0 failed:1 timeout:0
> >>
> >> log/006.memqcache shows:
> >>
> >> ../expected.txt result.txt differ: char 1, line 1
> >>
> >> So I checked the test script and found the error was generated by a
> >> Java program test.
> >>
> >> java jdbctest > result.txt 2>&1
> >> cmp ../expected.txt result.txt
> >> if [ $? != 0 ];then
> >> ./shutdownall
> >> exit 1
> >> fi
> >>
> >> In jdbctest.java:
> >>
> >> /*
> >> * Cache test in an explicit transaction
> >> */
> >> conn.setAutoCommit(false);
> >> // execute DML. This should prevent SELECTs from using
> >> query cache in the transaction.
> >> sql = "UPDATE t1 SET i = 2;";
> >> pst = conn.createStatement();
> >> pst.executeUpdate(sql);
> >> pst.close();
> >> // should not use the cache and should return "2",
> rather
> >> than "1"
> >> prest = conn.prepareStatement("SELECT * FROM t1");
> >> rs = prest.executeQuery();
> >>
> >> The expected file (expected.txt) has "2" but the result file
> >> (testdir/result.txt) was "1". This is the reason why the test
> >> failed. I wonder if there's something wrong with the feature when the
> >> query cache is enabled. Can you look into this?
> >>
> >> Regards,
> >> --
> >> Tatsuo Ishii
> >> SRA OSS K.K.
> >> English: http://www.sraoss.co.jp/index_en/
> >> Japanese:http://www.sraoss.co.jp
> >>
> >
> >
> > --
> > Nadav Shatz
> > Tailor Brands | CTO
>
--
Nadav Shatz
Tailor Brands | CTO
| Attachment | Content-Type | Size |
|---|---|---|
| v2-0001-Feature-load-balancing-control-by-table-tracking.patch | application/octet-stream | 89.7 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tatsuo Ishii | 2026-04-23 08:14:08 | Re: Proposal: Recent mutated table tracking in memory |
| Previous Message | Tatsuo Ishii | 2026-04-19 07:24:44 | Re: Proposal: Recent mutated table tracking in memory |