Re: WIP: [[Parallel] Shared] Hash

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: [[Parallel] Shared] Hash
Date: 2017-01-11 19:20:36
Message-ID: CAM3SWZRx87vdJ1fW=rbYsPdFE6-dSX2qEYBFCP8DYRqrJA2R1Q@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 11, 2017 at 10:57 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Jan 10, 2017 at 8:56 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>> Instead of all this, I suggest copying some of my changes to fd.c, so
>> that resource ownership within fd.c differentiates between a vfd that
>> is owned by the backend in the conventional sense, including having a
>> need to delete at eoxact, as well as a lesser form of ownership where
>> deletion should not happen.
>
> If multiple processes are using the same file via the BufFile
> interface, I think that it is absolutely necessary that there should
> be a provision to track the "attach count" of the BufFile. Each
> process that reaches EOXact decrements the attach count and when it
> reaches 0, the process that reduced it to 0 removes the BufFile. I
> think anything that's based on the notion that leaders will remove
> files and workers won't is going to be fragile and limiting, and I am
> going to push hard against any such proposal.

Okay. My BufFile unification approach happens to assume that backends
clean up after themselves, but that isn't a rigid assumption (of
course, these are always temp files, so we reason about them as temp
files). It could be based on a refcount fairly easily, such that, as
you say here, deletion of files occurs within workers (that "own" the
files) only as a consequence of their being the last backend with a
reference, which must therefore "turn out the lights" (delete the
file). That seems consistent with what I've done within fd.c, and what
I suggested to Thomas (that he more or less follow that approach).
You'd probably still want to throw an error when workers ended up not
deleting BufFile segments they owned, though, at least for parallel
tuplesort.
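
A minimal sketch of the refcount idea, for illustration only. It is not
the fd.c or buffile.c code from either patch. The struct and function
names (SketchSharedFile, sketch_attach, sketch_detach) are hypothetical,
and a C11 atomic counter stands in for whatever shared-memory state and
locking the real implementation would use:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical shared state for one temp file. In PostgreSQL this would
 * live in shared memory; here it is an ordinary struct.
 */
typedef struct SketchSharedFile
{
    atomic_int  refcount;   /* number of attached backends */
    char        path[64];   /* private to this module, never shown to clients */
} SketchSharedFile;

/* The creating backend initializes the state and holds the first reference. */
static void
sketch_init(SketchSharedFile *f, const char *path)
{
    atomic_init(&f->refcount, 1);
    snprintf(f->path, sizeof(f->path), "%s", path);
}

/* Every other backend that uses the file attaches once. */
static void
sketch_attach(SketchSharedFile *f)
{
    atomic_fetch_add(&f->refcount, 1);
}

/*
 * Called by each backend at end of transaction. Whoever drops the count
 * to zero "turns out the lights" and deletes the file. Returns true if
 * this caller was the last one and the delete succeeded.
 */
static bool
sketch_detach(SketchSharedFile *f)
{
    int         old = atomic_fetch_sub(&f->refcount, 1);

    assert(old > 0);        /* detaching more often than attaching is a bug */
    if (old == 1)
        return remove(f->path) == 0;
    return false;
}
```

With this shape it does not matter whether the leader or a worker
detaches last. A caller that wants the stricter behavior described above
could raise an error when the final detach fails to delete the file.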

This idea is something that's much more limited than the
SharedTemporaryFile() API that you sketched on the parallel sort
thread, because it only concerns resource management, and not how to
make access to the shared file concurrency safe in any special,
standard way. I think that this resource management is something that
should be managed by buffile.c (and the temp file routines within fd.c
that are morally owned by buffile.c, their only caller). It shouldn't
be necessary for a client of this new infrastructure, such as parallel
tuplesort or parallel hash join, to know anything about file paths.
Instead, they should be passing around some kind of minimal
private-to-buffile state in shared memory that coordinates backends
participating in BufFile unification: private state created by
buffile.c, and passed back to buffile.c. Everything should be
encapsulated within buffile.c, IMV, making parallel implementations as
close as possible to their serial implementations.
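
To make the encapsulation point concrete, here is a rough sketch of what
"private-to-buffile state" could look like from a client's side. All
names (SketchSharedState, sketch_shared_create, and so on) are
hypothetical and are not part of buffile.c. The path-naming scheme is
also an assumption made only so the example runs on its own:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* What a client would see: an opaque type it can only pass around. */
typedef struct SketchSharedState SketchSharedState;

/*
 * Implementation side. In a real layout this definition would be hidden
 * in the .c file, so clients could not look inside it.
 */
struct SketchSharedState
{
    int         owner_id;   /* enough for this module to rebuild the path */
};

/* Path construction stays entirely inside this module. */
static void
sketch_make_path(const SketchSharedState *s, char *buf, size_t len)
{
    snprintf(buf, len, "sketch_buffile_%d.tmp", s->owner_id);
}

/* How many bytes of shared memory a client must reserve for the state. */
static size_t
sketch_shared_size(void)
{
    return sizeof(SketchSharedState);
}

/* The owning backend creates the file and fills in the shared state. */
static FILE *
sketch_shared_create(SketchSharedState *state, int owner_id)
{
    char        path[64];

    assert(state != NULL);
    state->owner_id = owner_id;
    sketch_make_path(state, path, sizeof(path));
    return fopen(path, "w+");
}

/* Another backend opens the same file knowing only the shared state. */
static FILE *
sketch_shared_open(const SketchSharedState *state)
{
    char        path[64];

    assert(state != NULL);
    sketch_make_path(state, path, sizeof(path));
    return fopen(path, "r");
}

/* Cleanup also goes through the module, never through a raw path. */
static void
sketch_shared_delete(const SketchSharedState *state)
{
    char        path[64];

    assert(state != NULL);
    sketch_make_path(state, path, sizeof(path));
    remove(path);
}
```

A client such as a parallel sort would reserve sketch_shared_size()
bytes in its shared memory segment, have the owning backend call
sketch_shared_create(), and have the other backends call
sketch_shared_open() on the same state. The client code never handles a
file name.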

--
Peter Geoghegan
