| From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
|---|---|
| To: | Cary Huang <cary(dot)huang(at)highgo(dot)ca> |
| Cc: | Junwang Zhao <zhjwpku(at)gmail(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Support tid range scan in parallel? |
| Date: | 2025-11-18 01:51:59 |
| Message-ID: | CAApHDvrcwsw_=Y_8PhOjHzGRUtN5bXzL_yS_5vqKrCuaqtdkyw@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Fri, 7 Nov 2025 at 18:31, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> I still need to do a bit more testing on this, but in the meantime
> thought I'd share what I've done with it so that other people can look
> in parallel.
I've been looking at the v11 patch again. I did some testing and can't break it.
I noted down a couple of things:
1. table_parallelscan_initialize() is called first in a parallel TID
Range Scan which calls table_block_parallelscan_initialize() and may
set phs_syncscan to true. We directly then call
table_beginscan_parallel_tidrange(), which sets phs_syncscan = false
unconditionally. No bugs, but it is a little strange. One way to get
around this weirdness would be to move the responsibility of setting
phs_syncscan into table_parallelscan_initialize() and then use
table_beginscan_parallel_tidrange() to set phs_syncscan = false. I
wasn't overly concerned about this, so I didn't do that. I just wanted
to mention it here as someone else might think it's worth making
better.
2. I've made it so each worker calls TidRangeEval() to figure out the
TID range to scan. The first worker to get the lock in
table_block_parallelscan_startblock_init() gets to set the range of
blocks to scan for all workers. In the planner, the suitability of the
TID Range quals are checked with IsBinaryTidClause(), which allows
OpExprs with the ctid column compared to a Var-less expression that
contains no volatile functions. If someone coded up a parallel safe
volatile function and marked it as stable, each worker could end up
with different ctid values and one worker would win the race to set
the blocks to scan based on the TID values it got, which wouldn't
really suit the other workers and the tid values they ended up with.
I'm thinking of someone marks the volatile function as stable, then
we're entitled to having things behave strangely for them. To make
this right, I'd have to make it so only 1 worker evaluates the TID
expressions and then sets those somehow for the other workers. There'd
have to be some additional shared memory for that, and I don't think
the complexity of making that work is worthwhile. Mentioning it as
someone else might feel differently.
I've attached v12, which adds a mention in the docs about Parallel TID
Range scans being supported. It also does very minor adjustments to
the comments. Again, I've kept Cary's v10 and the changes I've made
separate. Of course, I'd squash these before commit.
Does anyone have any opinions on #1 or #2 or want to look at this? I'd
like to get this in soon.
David
| Attachment | Content-Type | Size |
|---|---|---|
| v12-0001-v10-parallel-tid-range-scan.patch | application/octet-stream | 25.9 KB |
| v12-0002-fixup-v10-parallel-tid-range-scan.patch | application/octet-stream | 26.2 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Peter Smith | 2025-11-18 01:57:33 | Re: [PATCH] Add pg_get_subscription_ddl() function |
| Previous Message | wenhui qiu | 2025-11-18 01:47:45 | Re: POC: make mxidoff 64 bits |