Re: Support tid range scan in parallel?

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Cary Huang <cary(dot)huang(at)highgo(dot)ca>
Cc: Junwang Zhao <zhjwpku(at)gmail(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support tid range scan in parallel?
Date: 2025-07-29 05:11:19
Message-ID: CAApHDvoyxYkH5eg_RVxfz+mVt+eWgHvB25xy-z=Xs3nLhU+RLQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 25 Jul 2025 at 06:46, Cary Huang <cary(dot)huang(at)highgo(dot)ca> wrote:
> Thank you so much for the review! I have addressed the comments in the
> attached v7 patch.

I've now spent quite a bit of time going over this patch and testing
it. One issue I found was in heap_set_tidrange() where you were not
correctly setting the scan limits for the "if
(ItemPointerCompare(&highestItem, &lowestItem) < 0)" case. Through a
bit of manually overwriting the planner's choice using the debugger, I
could get the executor to read an entire table despite the tid range
being completely empty. I likely could have got this to misbehave
without the debugger if I'd used PREPAREd statements and made the
ctids parameters to that. It's just the planner didn't choose a
parallel plan with an empty TID range due to the costs being too low.

For the record, here's the unpatched output below:

# explain (analyze) select count(*) from parallel_tidrangescan where
ctid >= '(10,0)' and ctid <= '(9,10)';
Aggregate (cost=2.00..2.01 rows=1 width=8) (actual
time=18440.132..18440.174 rows=1.00 loops=1)
Buffers: shared read=40
-> Gather (cost=0.01..2.00 rows=1 width=0) (actual
time=18440.126..18440.166 rows=0.00 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared read=40
-> Parallel Tid Range Scan on parallel_tidrangescan
(cost=0.01..2.00 rows=1 width=0) (actual time=2414.495..2414.495
rows=0.00 loops=3)
TID Cond: ((ctid >= '(10,0)'::tid) AND (ctid <= '(9,10)'::tid))
Buffers: shared read=40

Note the "Buffers: shared read=40", which is this entire table. After
moving the code which sets the ParallelBlockTableScanDesc's limits
into heap_setscanlimits(), I get:

# explain (analyze) select count(*) from parallel_tidrangescan where
ctid >= '(10,0)' and ctid <= '(9,10)';
Aggregate (cost=2.00..2.01 rows=1 width=8) (actual
time=17.787..19.531 rows=1.00 loops=1)
-> Gather (cost=0.01..2.00 rows=1 width=0) (actual
time=17.783..19.527 rows=0.00 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Tid Range Scan on parallel_tidrangescan
(cost=0.01..2.00 rows=1 width=0) (actual time=0.003..0.004 rows=0.00
loops=3)
TID Cond: ((ctid >= '(10,0)'::tid) AND (ctid <= '(9,10)'::tid))

I'm now trying to convince myself that it's safe to adjust the
ParallelBlockTableScanDesc fields in heap_setscanlimits(). These
fields are being adjusted during the call to TidRangeNext() via
table_rescan_tidrange(), which is *during* execution, so there could
be any number of parallel workers doing this concurrently. I'm unsure
at this stage if all those workers want to be using the same scan
limits, either.

Currently, I think the above is a problem and it doesn't quite feel
like committer duty to fix this part. There's a chance I may get more
time, but if I don't, I've attached your v7 patch plus the adjustments
I've made to it so far.

David

Attachment Content-Type Size
v8-0001-v7-parallel-TID-range-scan-patch.patch application/octet-stream 25.2 KB
v8-0002-fixup-v7-parallel-TID-range-scan-patch.patch application/octet-stream 15.0 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2025-07-29 05:13:32 Re: [Patch] add new parameter to pg_replication_origin_session_setup
Previous Message Ashutosh Bapat 2025-07-29 05:03:46 Re: SQL Property Graph Queries (SQL/PGQ)