| From: | Cary Huang <cary(dot)huang(at)highgo(dot)ca> |
|---|---|
| To: | "David Rowley" <dgrowleyml(at)gmail(dot)com> |
| Cc: | "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Junwang Zhao" <zhjwpku(at)gmail(dot)com>, "Rafia Sabih" <rafia(dot)pghackers(at)gmail(dot)com> |
| Subject: | Re: Support tid range scan in parallel? |
| Date: | 2025-12-05 20:12:50 |
| Message-ID: | 19af025560c.285806d9196801.7033912717900263131@highgo.ca |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi David,
Thanks a lot for the detailed review, the fixes, and for pushing this
forward. I really appreciate the time you spent going through the
patch set
> 1. Moved block limiting logic for parallel scans into
> table_block_parallelscan_startblock_init(). There's currently a lock
> here to ensure only 1 worker can set the shared memory fields at a
> time. I've hooked into the same lock to set the startblock and
> numblocks.
I agree this is a much better location than heap_setscanlimits() to
ensure only the leader can set parallel scan limits for all workers.
You are right, the condition I put in heap_setscanlimits() cannot
guarentee the leader would get there first. Thanks for pointing
this out.
> 1. table_parallelscan_initialize() is called first in a parallel TID
> Range Scan which calls table_block_parallelscan_initialize() and may
> set phs_syncscan to true. We directly then call
> table_beginscan_parallel_tidrange(), which sets phs_syncscan = false
> unconditionally. No bugs, but it is a little strange. One way to get
> around this weirdness would be to move the responsibility of setting
> phs_syncscan into table_parallelscan_initialize() and then use
> table_beginscan_parallel_tidrange() to set phs_syncscan = false. I
> wasn't overly concerned about this, so I didn't do that. I just wanted
> to mention it here as someone else might think it's worth making
> better.
Nice catch. Yes it is a bit weird but harmless. I agree to leave it as is
now.
> within each worker process.
> </para>
> </listitem>
> + <listitem>
> + <para>
> + In a <emphasis>parallel tid range scan</emphasis>, the range of blocks
> + will be subdivided into smaller ranges which are shared among the
> + cooperating processes. Each worker process will complete the scanning
> + of its given range of blocks before requesting an additional range of
> + blocks.
> + </para>
> + </listitem>
> </itemizedlist>
I may be missing some info or wrong but my impression is that
the range of blocks is actually set by the leader worker and is
the same among all the cooperating workers rather than
subdivided. The workers fetches as many blocks to process as
they can (similar to sequential scan in parllel) as long as the
block falls within the TID range. Current block number is
stored in parallel scan descriptor in shared memory so workers
will not fetch the same block during scan.
Thanks again for all the help and improvements!
Cary Huang
www.highgo.ca
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Corey Huinker | 2025-12-05 20:34:12 | Re: vacuumdb: add --dry-run |
| Previous Message | Kirill Reshke | 2025-12-05 20:10:36 | Re: vacuumdb: add --dry-run |