Re: How to estimate the shared memory size required for parallel scan?

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Masayuki Takahashi <masayuki038(at)gmail(dot)com>
Subject: Re: How to estimate the shared memory size required for parallel scan?
Date: 2019-07-08 10:46:59
Message-ID: 325E251F-5135-4026-B0DC-119956571B7D@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 1 Jul 2019, at 13:03, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Fri, May 3, 2019 at 6:06 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>> Added to commitfest.
>
> Rebased.

Below is a review of this patch.

It does what it says on the tin, applies clean without introducing compiler
warnings and it seems like a good addition (for both utility and as an example
implementation). Testing with various parallel worker settings shows that it
properly observes user configuration for parallelism.

The tests pass, but there are also no new tests that could fail. I tried to
construct a test case for this, but it’s fairly hard to make one that can be
added to the repo. Have you given this any thought?

Regarding documentation, it seems reasonable to add a sentence on the file_fdw
page when the user can expect parallel scan.

A few smaller comments on the patch:

In the commit message, I assume you mean “writing *is* probablt some way off”:

"In theory this could be used for other kinds of parallel copying in the
future (but infrastructure for parallel writing it probably some way
off)."

Single-line comments don’t end with a period IIRC (there a few others but only
including the first one here):

+/* The size of the chunks used for parallel scans, in bytes. */

Is there a reason to create partial_path before compute_parallel_worker, since
the latter can make us end up not adding the path at all? Can’t we reverse the
condition on parallel_workers to if (parallel_workers <= 0) and if so skip the
partial path?:

+ partial_path = create_foreignscan_path(root, baserel, NULL,
+ baserel->rows,
+ startup_cost,
+ total_cost,
+ NIL, NULL, NULL, coptions);
+
+ parallel_workers = compute_parallel_worker(baserel,
+ fdw_private->pages, -1,
+ max_parallel_workers_per_gather);
...
+ if (parallel_workers > 0)
+ add_partial_path(baserel, (Path *) partial_path);

Since this is quite general code which can be used by other extensions as well,
maybe we should reword the comment to say so?:

+ /* For use by file_fdw's parallel scans. */

cheers ./daniel

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2019-07-08 10:57:44 Re: POC: Cleaning up orphaned files using undo logs
Previous Message Dave Cramer 2019-07-08 10:41:38 Re: Re: Reviving the "Stopping logical replication protocol" patch from Vladimir Gordichuk