Re: Support tid range scan in parallel?

From: Junwang Zhao <zhjwpku(at)gmail(dot)com>
To: Cary Huang <cary(dot)huang(at)highgo(dot)ca>
Cc: Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support tid range scan in parallel?
Date: 2025-06-08 10:52:47
Message-ID: CAEG8a3LAcNNUceNfCOK+PsyLbZ8+40qHgD1F1116VZs+2pVnZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Cary,

On Fri, Jun 6, 2025 at 5:24 AM Cary Huang <cary(dot)huang(at)highgo(dot)ca> wrote:
>
> Hello
>
> Sorry David and all who have reviewed the patch, it's been awhile since the patch
> was last worked on :(. Thank you all for the reviews and comments! Attached is
> the rebased patch that adds support for parallel TID range scans. This feature is
> particularly useful scanning large tables where the data needs to be scanned in
> sizable segments using a TID range in the WHERE clause to define each segment.
> By enabling parallelism, this approach can improve performance compared to
> both non-parallel TID range scans and traditional sequential scans.
>
> Regards
>
> Cary Huang

Thanks for updating the patch. I have a few comments on it.

+ /*
+ * if parallel mode is used, store startblock and numblocks in parallel
+ * scan descriptor as well.
+ */
+ if (scan->rs_base.rs_parallel != NULL)
+ {
+ ParallelBlockTableScanDesc bpscan = NULL;
+
+ bpscan = (ParallelBlockTableScanDesc) scan->rs_base.rs_parallel;
+ bpscan->phs_startblock = scan->rs_startblock;
+ bpscan->phs_numblock = scan->rs_numblocks;
+ }

It would be more intuitive and clearer to directly use startBlk and numBlks
to set these values. Since scan->rs_startblock and scan->rs_numblocks
are already set using these variables, using the same approach for bpscan
would make the code easier to understand.

Another nitty-gritty is that you might want to use a capital `If` in the
comments to maintain the same style.

+ if (nallocated >= pbscan->phs_nblocks || (pbscan->phs_numblock != 0 &&
+ nallocated >= pbscan->phs_numblock))

I'd suggest explictly setting phs_numblock to InvalidBlockNumber in
table_block_parallelscan_initialize, and compare with InvalidBlockNumber
here.

--
Regards
Junwang Zhao

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Arseniy Mukhin 2025-06-08 12:39:22 Re: amcheck support for BRIN indexes
Previous Message Junwang Zhao 2025-06-08 09:41:17 Add a bound check to TidRangeEval