Re: Row pattern recognition

From: Henson Choi <assam258(at)gmail(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>, jian(dot)universality(at)gmail(dot)com
Cc: zsolt(dot)parragi(at)percona(dot)com, er(at)xs4all(dot)nl, sjjang112233(at)gmail(dot)com, vik(at)postgresfriends(dot)org, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, li(dot)evan(dot)chao(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Row pattern recognition
Date: 2026-06-22 07:07:43
Message-ID: CAAAe_zDG=5kJa=snRXuwARvVS1SEvLyiV84igPZx4nYu62EWNA@mail.gmail.com
Lists: pgsql-hackers

Hi Tatsuo, Jian,

Please find attached (coverage.tgz) the code coverage analysis for the RPR
branch.

* Measurement setup
- Target: PostgreSQL RPR branch, modified-lines basis (RPR-base..RPR diff)
- Build: gcc, configured with --enable-coverage and --with-llvm (LLVM JIT
  module included; both C and C++ instrumented)
- Tests: make check-world (regression + TAP + contrib; no forced JIT
  settings)
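
To make "modified-lines basis" concrete, here is a minimal Python sketch of the idea: collect the line numbers added by the RPR-base..RPR diff, then count only those lines when computing coverage. This is not the tooling used for the attached report (the message does not say which tool was used); the function names and the shape of the `hits` mapping are assumptions made for illustration.

```python
import re

# Unified-diff hunk header, e.g. "@@ -10,3 +12,5 @@"; a missing count means 1.
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def added_lines(diff_text):
    """Return {path: set of new-file line numbers added by a unified diff}."""
    result = {}
    path = None
    old_left = new_left = 0   # lines still expected in the current hunk
    new_lineno = 0
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            # Inside a hunk body: classify by the first character only, so a
            # removed SQL comment ("--- foo") is not mistaken for a file header.
            if line.startswith("+"):
                result[path].add(new_lineno)
                new_lineno += 1
                new_left -= 1
            elif line.startswith("-"):
                old_left -= 1
            elif line.startswith("\\"):
                pass          # "\ No newline at end of file"
            else:             # context line, present in both old and new file
                new_lineno += 1
                old_left -= 1
                new_left -= 1
            continue
        if line.startswith("+++ "):
            path = line[4:].split("\t")[0]
            if path.startswith("b/"):
                path = path[2:]
            result.setdefault(path, set())
            continue
        m = HUNK_RE.match(line)
        if m and path is not None:
            old_left = int(m.group(1)) if m.group(1) is not None else 1
            new_lineno = int(m.group(2))
            new_left = int(m.group(3)) if m.group(3) is not None else 1
    return result


def modified_line_coverage(added, hits):
    """Count (covered, total) over added lines that are instrumented.

    hits is assumed to be {path: {lineno: execution_count}}, listing only
    executable lines, as a coverage tool would report them.
    """
    covered = total = 0
    for path, lines in added.items():
        file_hits = hits.get(path, {})
        for n in lines:
            if n in file_hits:
                total += 1
                if file_hits[n] > 0:
                    covered += 1
    return covered, total
```

Added lines that the coverage tool does not instrument (comments, declarations, blank lines) are skipped, so the denominator contains only executable modified lines.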

* Results
- Modified-line coverage: 2,608 / 2,702 (96.5%)
- Functions: 172 / 173 (99.4%)
- Breakdown of the 94 uncovered lines:
  - Reachable (coverable by tests): 28 lines
  - Unreachable (defensive / dead code): 66 lines
    - worth cleaning up (Assert / remove / coverage-exclude): 22 lines
    - best kept as idiomatic guards (enum default, pg_unreachable(),
      public windowapi.h relpos guards): 44 lines

* Projected coverage (modified-lines basis)
- Current: 96.5% (2,608 / 2,702)
- After adding the proposed tests (+28 reachable): 97.6% (2,636 / 2,702)
- After also cleaning up only the lines worth changing (-22; the 44
  idiomatic guards are kept on purpose): 98.4% (2,636 / 2,680)
- If every unreachable line were removed (not advised for the idiomatic
  guards): ~100% (no residual uncovered lines)
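
The projections above follow from simple arithmetic on the reported counts. A small Python sketch reproduces them, assuming (as the figures imply) that each new test turns an uncovered line into a covered one and that each cleaned-up line drops out of the denominator. The function and key names are made up for illustration.

```python
def pct(covered, total):
    """Coverage as a percentage, rounded to one decimal place."""
    return round(100.0 * covered / total, 1)


def project(total, covered, reachable, cleanup, guards):
    """Project modified-line coverage for each scenario in the message."""
    uncovered = total - covered
    # The three categories must account for every uncovered line.
    assert uncovered == reachable + cleanup + guards
    after_tests = covered + reachable   # tests cover the reachable lines
    return {
        "current": pct(covered, total),
        "tests_only": pct(after_tests, total),
        "tests_plus_cleanup": pct(after_tests, total - cleanup),
        "full_cleanup": pct(after_tests, total - cleanup - guards),
    }


# Counts taken from the results reported in this message.
projection = project(total=2702, covered=2608,
                     reachable=28, cleanup=22, guards=44)
```

With these inputs `projection` holds 96.5, 97.6, 98.4 and 100.0 for the four scenarios, matching the figures listed above.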

* Picking a target -- your input would help
The analysis points to three possible levels:
(a) tests only            -> 97.6% (add tests for the 28 reachable lines)
(b) tests + safe cleanups -> 98.4% (also Assert/remove the 22 worth-fixing
                                    lines; keep the 44 idiomatic guards as-is)
(c) full cleanup          -> ~100% (also rework the idiomatic guards
                                    -- usually undesirable)
My own inclination sits somewhere between (a) and (b): add all the
reachable-line tests, and clean up only the most clear-cut lines (e.g. the
arithmetic-underflow Assert conversions and the dead _equalRPRPattern body),
while deciding the remaining defensive lines case by case rather than
touching all 22 at once. I'd value your view on how far to take this.

One caveat: this reachable/unreachable classification was produced with AI
assistance, so it may be wrong in places. When I actually write the test
cases I will scrutinize each item individually and verify it against a
coverage build before proposing anything.

* What the report contains
Each uncovered line carries a collapsible box stating:
- Reachable / Unreachable classification + confidence
- Reachable: the concrete SQL test that covers it
- Unreachable: the reason it cannot execute, and the recommended source
  change
Consecutive lines on the same straight-line flow (no branch in between) are
merged into one box.

* How to view
tar xzf coverage.tgz
# open coverage/index.html in a browser
# -> pick a file -> expand the collapsible box under each red (uncovered) line

* Contents
- coverage/index.html : per-file coverage overview
- coverage/html/ : per-file detail (source + uncovered-line analysis)
- coverage/untested.md : checklist of uncovered lines

Please review. Feel free to reply with any questions.

Best regards,
Henson

Attachment Content-Type Size
coverage.tgz application/x-gzip 911.2 KB
