Re: How to estimate the shared memory size required for parallel scan?

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Masayuki Takahashi <masayuki038(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How to estimate the shared memory size required for parallel scan?
Date: 2018-08-18 13:40:05
Message-ID: CAEepm=2STNSpgE8e8tJ5ohNMrQwiqDiXrrOJjzvzY4j+yDZ0QA@mail.gmail.com
Lists: pgsql-hackers

On Sun, Aug 19, 2018 at 12:01 AM, Masayuki Takahashi
<masayuki038(at)gmail(dot)com> wrote:
>> It's up to you to design a struct to hold whatever data,
>> spinlocks, LWLocks, atomics etc you might need to orchestrate your
>> parallel scan.
>
> If FDW(ex. cstore_fdw) does not need to share some information among
> workers more than PostgreSQL core in parallel scan, does it not need
> to allocate DSM?

Right. You don't have to supply the InitializeDSMForeignScan,
ReInitializeDSMForeignScan or InitializeWorkerForeignScan functions.
If you just supply an IsForeignScanParallelSafe function that returns
true, that would allow your FDW to be used inside parallel workers and
wouldn't need any extra shared memory, but it wouldn't be a "parallel
scan". It would just be "parallel safe". Each process that scans your
FDW would run its own full, normal scan (presumably returning
the same tuples in each process). That means it can be used, for
example, on the inner side of a join, where the outer side comes from
a parallel scan. Like file_fdw can.
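
As a rough sketch of the "parallel safe only" case, the block below sets
just the IsForeignScanParallelSafe callback and leaves the DSM-related
callbacks unset. To keep it compilable without the PostgreSQL headers it
uses opaque stand-in types and a cut-down DemoFdwRoutine struct; those
stand-ins and every demo_* name are assumptions for illustration only.
In a real FDW the types come from the server headers and the handler
fills in the real FdwRoutine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Opaque stand-ins so the sketch compiles on its own (assumption:
 * a real FDW gets these types from the PostgreSQL headers). */
typedef struct PlannerInfo PlannerInfo;
typedef struct RelOptInfo RelOptInfo;
typedef struct RangeTblEntry RangeTblEntry;

typedef bool (*IsForeignScanParallelSafe_function) (PlannerInfo *root,
                                                    RelOptInfo *rel,
                                                    RangeTblEntry *rte);

/* Hypothetical cut-down stand-in for FdwRoutine: only the members
 * discussed in this thread. The DSM callbacks are kept as void * here
 * because the sketch only ever stores NULL in them. */
typedef struct DemoFdwRoutine
{
    IsForeignScanParallelSafe_function IsForeignScanParallelSafe;
    void       *EstimateDSMForeignScan;
    void       *InitializeDSMForeignScan;
    void       *ReInitializeDSMForeignScan;
    void       *InitializeWorkerForeignScan;
} DemoFdwRoutine;

/* Tell the planner the scan may run inside a parallel worker. */
static bool
demo_IsForeignScanParallelSafe(PlannerInfo *root, RelOptInfo *rel,
                               RangeTblEntry *rte)
{
    (void) root;
    (void) rel;
    (void) rte;
    return true;
}

/* Build a routine table that is parallel safe but not parallel aware. */
static DemoFdwRoutine
demo_make_routine(void)
{
    DemoFdwRoutine r = {0};

    r.IsForeignScanParallelSafe = demo_IsForeignScanParallelSafe;

    /* The DSM callbacks are deliberately left unset: no shared memory. */
    assert(r.InitializeDSMForeignScan == NULL);
    return r;
}
```

With only this callback set, nothing is shared between processes, so
each one that scans the foreign table sees the whole table.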

A true parallel scan of an FDW would be one where each process emits
an arbitrary fraction of the tuples, but together they emit all of the
tuples. You'd almost certainly need to use some shared memory to
coordinate that. To say that you support that, I think your
GetForeignPaths() function would need to call add_partial_path(). And
unless I'm mistaken, whether or not InitializeDSMForeignScan etc are
called might be the only indication you get of whether you need to run
in parallel-aware mode. I haven't personally heard of any FDWs that
can do this yet, but I just tried hacking file_fdw to register a
partial path and it seems to work (though of course the results are
duplicated because the emitted tuples are not actually partial).
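
To illustrate the difference, here is a small standalone simulation, not
PostgreSQL code. It assumes the foreign data can be split into numbered
chunks and that the participants share one atomic "next chunk" counter,
which in a real FDW would live in the DSM area handed to
InitializeDSMForeignScan. DemoSharedScan and the demo_* functions are
hypothetical names, and C11 atomics are used here only to keep the
sketch self-contained.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared scan state (would be placed in shared memory). */
typedef struct DemoSharedScan
{
    atomic_size_t next_chunk;   /* next chunk nobody has claimed yet */
    size_t      nchunks;        /* total number of chunks to scan */
} DemoSharedScan;

static void
demo_shared_init(DemoSharedScan *s, size_t nchunks)
{
    atomic_init(&s->next_chunk, 0);
    s->nchunks = nchunks;
}

/* Claim the next unclaimed chunk; returns false once all are taken. */
static bool
demo_claim_chunk(DemoSharedScan *s, size_t *chunk)
{
    size_t      c = atomic_fetch_add(&s->next_chunk, 1);

    if (c >= s->nchunks)
        return false;
    *chunk = c;
    return true;
}

/* Coordinated scan: participants take turns claiming chunks.
 * emitted[i] counts how many times chunk i was returned. */
static void
demo_run_coordinated(size_t nchunks, int nworkers, int *emitted)
{
    DemoSharedScan shared;
    bool        progress = true;

    demo_shared_init(&shared, nchunks);
    while (progress)
    {
        progress = false;
        for (int w = 0; w < nworkers; w++)
        {
            size_t      c;

            if (demo_claim_chunk(&shared, &c))
            {
                emitted[c]++;
                progress = true;
            }
        }
    }
}

/* Uncoordinated scan: every participant reads every chunk, which is
 * what happens when a full scan is registered as a partial path. */
static void
demo_run_uncoordinated(size_t nchunks, int nworkers, int *emitted)
{
    for (int w = 0; w < nworkers; w++)
        for (size_t c = 0; c < nchunks; c++)
            emitted[c]++;
}
```

In the coordinated run every chunk is emitted exactly once, no matter
which participant claims it. In the uncoordinated run every chunk is
emitted once per participant, which matches the duplicated results seen
with the hacked file_fdw.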

--
Thomas Munro
http://www.enterprisedb.com
