Re: Row pattern recognition

From: Henson Choi <assam258(at)gmail(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>, jian(dot)universality(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org, zsolt(dot)parragi(at)percona(dot)com, sjjang112233(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, li(dot)evan(dot)chao(at)gmail(dot)com
Subject: Re: Row pattern recognition
Date: 2026-06-23 14:28:39
Message-ID: CAAAe_zDR8XqOfYL1ay5o6Y2OogJNed3oR+aG=Hm1Ja=t=v6=bw@mail.gmail.com

Hi hackers,

As a final memory-safety check before wrapping up the RPR patch, I ran the
branch under Valgrind. The short version: no memory errors were found.

Setup, on Linux/AMD64 (Valgrind is not usable on macOS):

  configure:    --enable-debug --enable-cassert -DUSE_VALGRIND
  also defined: -DCOPY_PARSE_PLAN_TREES -DWRITE_READ_PARSE_PLAN_TREES
                -DRAW_EXPRESSION_COVERAGE_TEST
  valgrind:     --suppressions=src/tools/valgrind.supp --trace-children=yes
                --track-origins=yes --leak-check=no --error-exitcode=128

The round-trip defines keep the parse/plan-tree copy and write/read paths
on by default, so _outRPRPattern, _readRPRPattern and _copyRPRPattern are
exercised on every plan tree, not just at parse time.

I ran installcheck against this instance in three phases.

Phase A, round-trips on: rpr, rpr_nfa, rpr_explain and rpr_integration.
These are normal-size patterns, so the node copy/write-read paths are
verified along the way. All passed.

Phase B, round-trips off: rpr_base, which includes the large 32767-element
pattern. Serializing a tree that big under Valgrind is impractically slow,
and the serialization path is already covered by the normal-size patterns
in Phase A, so I disabled debug_copy_parse_plan_trees and
debug_write_read_parse_plan_trees here to exercise the NFA build and
scanRPRPattern's large arrays without the serialization bottleneck. Passed.
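
For anyone reproducing Phase B, one way to turn the two GUCs off for a
single installcheck run is through PGOPTIONS. This is a minimal sketch of
that approach, not necessarily how it was done here; the helper name
phase_b_env is hypothetical.

```python
import os

# GUC names as given above; both are switched off for the large-pattern run.
ROUND_TRIP_GUCS = (
    "debug_copy_parse_plan_trees",
    "debug_write_read_parse_plan_trees",
)


def phase_b_env(base_env=None):
    """Return a copy of the environment with the round-trip GUCs disabled.

    Hypothetical helper: it appends "-c name=off" settings to PGOPTIONS,
    which libpq passes to the server at connection start.
    """
    env = dict(os.environ if base_env is None else base_env)
    extra = " ".join(f"-c {guc}=off" for guc in ROUND_TRIP_GUCS)
    env["PGOPTIONS"] = f"{env.get('PGOPTIONS', '')} {extra}".strip()
    return env
```

The returned mapping can be passed as env= to subprocess.run for whatever
command drives the rpr_base test; any PGOPTIONS already set by the caller
is kept and the two settings are appended to it.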

Phase C: window, preceded by test_setup, create_index and sanity_check so
the table, indexes and stats are in place. This one reported a regression
diff: a few EXPLAIN plans came out with Seq Scan / Hash Join where the
expected output has Index Only Scan / Nested Loop or Merge Join. That is a
planner artifact of running window outside the full parallel schedule (a
different stats/cost environment), not anything RPR changed; the same
queries match expectations under a normal check-world run. I included
window mainly to exercise the WindowAgg/RPR interaction under
instrumentation, and for that the diff does not matter, because the memory
verdict comes from the Valgrind logs rather than the regression output.

Results: no VALGRINDERROR markers in any per-pid log, and every log is
empty (with --quiet, a clean run leaves 0-byte logs), window included. The
postmaster exited 0, so --error-exitcode=128 never fired, and the tests ran
tens of times slower than usual, which confirms the code really did run
under instrumentation. So across parse, NFA build, pattern scan, EXPLAIN
and node serialization, no invalid read/write, uninitialised-value use,
out-of-bounds access or invalid free was observed.
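
The log check described above can be automated along these lines. This is
a rough sketch only: the function name, the log directory and the "*.log"
glob are assumptions for illustration, since the per-pid log naming depends
on the --log-file setting used.

```python
from pathlib import Path

MARKER = "VALGRINDERROR"


def scan_valgrind_logs(log_dir, pattern="*.log"):
    """Return (file name, reason) for every log that is not clean.

    A log counts as clean only if it is empty; with --quiet, Valgrind
    writes nothing unless it has an error to report.
    """
    findings = []
    for path in sorted(Path(log_dir).glob(pattern)):
        text = path.read_text(errors="replace")
        if MARKER in text:
            findings.append((path.name, "marker"))
        elif text:
            findings.append((path.name, "non-empty"))
    return findings
```

An empty result list corresponds to the outcome reported here: no markers
and only 0-byte logs.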

The usual caveat applies: this is dynamic analysis, so it only attests to
the paths the tests actually exercised. As I add the coverage tests
discussed in the other thread, I will re-run the same setup over the
expanded suite.

Best regards,
Henson
