pgsql: Add parallelism support for TID Range Scans

From: David Rowley <drowley(at)postgresql(dot)org>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Add parallelism support for TID Range Scans
Date: 2025-11-27 01:05:25
Message-ID: E1vOQRo-001a6K-1k@gemulon.postgresql.org
Lists: pgsql-committers

Add parallelism support for TID Range Scans

In v14, bb437f995 added support for scanning for ranges of TIDs using a
dedicated executor node for the purpose. Here, we allow these scans to
be parallelized. The range of blocks to scan is divvied up similarly to
how a Parallel Seq Scan does it, where 'chunks' of blocks are
allocated to each worker and the size of those chunks is slowly reduced
down to 1 block per worker by the time we're nearing the end of the
scan. Doing that means workers finish at roughly the same time.
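
The chunk ramp-down described above can be sketched roughly as follows. This
is a simplified, single-threaded illustration and not the code from
tableam.c: the ChunkAllocator type, the chunk_alloc_* functions and the
SKETCH_* constants are hypothetical names and values chosen for this
example, and a real parallel scan would keep the allocator state in shared
memory and advance it atomically.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuning constants, chosen for this sketch only. */
#define SKETCH_NCHUNKS         2048  /* aim for roughly this many chunks */
#define SKETCH_RAMPDOWN_CHUNKS 64    /* shrink when this few chunks remain */
#define SKETCH_MAX_CHUNK_SIZE  8192  /* upper bound on blocks per chunk */

typedef struct ChunkAllocator
{
    uint64_t next_block;  /* first block not yet handed out */
    uint64_t end_block;   /* one past the last block in the range */
    uint32_t chunk_size;  /* current chunk size, in blocks */
} ChunkAllocator;

/* Smallest power of two >= v, for 1 <= v <= SKETCH_MAX_CHUNK_SIZE. */
static uint32_t
next_pow2(uint32_t v)
{
    uint32_t p = 1;

    while (p < v)
        p <<= 1;
    return p;
}

/* Set up allocation over the block range [start_block, start_block + nblocks). */
void
chunk_alloc_init(ChunkAllocator *a, uint64_t start_block, uint64_t nblocks)
{
    uint64_t target = nblocks / SKETCH_NCHUNKS;

    assert(a != NULL);
    if (target < 1)
        target = 1;
    if (target > SKETCH_MAX_CHUNK_SIZE)
        target = SKETCH_MAX_CHUNK_SIZE;

    a->next_block = start_block;
    a->end_block = start_block + nblocks;
    a->chunk_size = next_pow2((uint32_t) target);
}

/*
 * Hand out the next chunk.  Returns the number of blocks in the chunk (0 when
 * the range is exhausted) and stores the chunk's first block in *first.
 */
uint32_t
chunk_alloc_next(ChunkAllocator *a, uint64_t *first)
{
    uint64_t remaining;
    uint32_t n;

    assert(a != NULL && first != NULL);
    remaining = a->end_block - a->next_block;
    if (remaining == 0)
        return 0;

    /* Halve the chunk size as the end of the range approaches. */
    while (a->chunk_size > 1 &&
           remaining < (uint64_t) a->chunk_size * SKETCH_RAMPDOWN_CHUNKS)
        a->chunk_size >>= 1;

    n = a->chunk_size;
    if (n > remaining)
        n = (uint32_t) remaining;

    *first = a->next_block;
    a->next_block += n;
    return n;
}
```

Each worker would call chunk_alloc_next() whenever it runs out of blocks.
Under these assumptions, a range of 32768 blocks starts with 16-block chunks
and ends with 1-block chunks, so no worker is left holding a large chunk
while the others sit idle.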

Allowing TID Range Scans to be parallelized removes the dilemma from the
planner as to whether a Parallel Seq Scan will cost less than a
non-parallel TID Range Scan due to the CPU concurrency of the Seq Scan
(disk costs are not divided by the number of workers). It was possible
the planner could choose the Parallel Seq Scan, which would result in
reading more blocks during execution than the TID Range Scan would have.
Allowing Parallel TID Range Scans removes the trade-off the planner
makes when choosing between reduced CPU costs due to parallelism vs
additional I/O from the Parallel Seq Scan due to it scanning blocks from
outside of the required TID range. There are also, of course, the
traditional parallelism performance benefits to be gained, which likely
don't need to be explained here.
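
To make the costing trade-off concrete, here is a toy calculation. It is only
a sketch: the ScanCost type and parallel_total_cost() are hypothetical, the
numbers used below are invented, and the real model in costsize.c is
considerably more detailed. The one property taken from the description
above is that CPU cost is divided among workers while disk cost is not.

```c
#include <assert.h>

/* Hypothetical split of a scan's cost into CPU and disk components. */
typedef struct ScanCost
{
    double cpu;   /* per-tuple and per-block processing cost */
    double disk;  /* block read cost */
} ScanCost;

/*
 * Total cost when the CPU component is shared among workers but the disk
 * component is charged in full.  A parallel_divisor of 1.0 models a
 * non-parallel scan.
 */
double
parallel_total_cost(ScanCost c, double parallel_divisor)
{
    assert(parallel_divisor >= 1.0);
    return c.cpu / parallel_divisor + c.disk;
}
```

Suppose, for illustration, a 1000-block table where the TID range covers 800
blocks, with made-up costs of {1000, 1000} for a full Seq Scan and {800, 800}
for the TID Range Scan. The non-parallel TID Range Scan costs 1600, while a
Parallel Seq Scan with a divisor of 4 costs 1250, so the planner would prefer
it despite the 200 extra blocks read. A Parallel TID Range Scan with the same
divisor costs 1000 and wins without the extra I/O.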

Author: Cary Huang <cary(dot)huang(at)highgo(dot)ca>
Author: David Rowley <dgrowleyml(at)gmail(dot)com>
Reviewed-by: Junwang Zhao <zhjwpku(at)gmail(dot)com>
Reviewed-by: Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>
Reviewed-by: Steven Niu <niushiji(at)gmail(dot)com>
Discussion: https://postgr.es/m/18f2c002a24.11bc2ab825151706.3749144144619388582@highgo.ca

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/0ca3b16973a8bb1c185f56e65edcadc0d9d2c406

Modified Files
--------------
doc/src/sgml/parallel.sgml | 9 ++
src/backend/access/heap/heapam.c | 4 +-
src/backend/access/table/tableam.c | 148 +++++++++++++++++++++--------
src/backend/executor/execParallel.c | 21 ++++
src/backend/executor/nodeTidrangescan.c | 80 ++++++++++++++++
src/backend/optimizer/path/costsize.c | 34 +++++--
src/backend/optimizer/path/tidpath.c | 24 ++++-
src/backend/optimizer/util/pathnode.c | 7 +-
src/include/access/relscan.h | 2 +
src/include/access/tableam.h | 14 ++-
src/include/executor/nodeTidrangescan.h | 7 ++
src/include/nodes/execnodes.h | 2 +
src/include/optimizer/pathnode.h | 3 +-
src/test/regress/expected/tidrangescan.out | 105 ++++++++++++++++++++
src/test/regress/sql/tidrangescan.sql | 44 +++++++++
15 files changed, 446 insertions(+), 58 deletions(-)
