Re: Parallel tuplesort (for parallel B-Tree index creation)

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Subject: Re: Parallel tuplesort (for parallel B-Tree index creation)
Date: 2017-11-03 01:20:20
Message-ID: CAEepm=1HzYBiJOZ-8ruM7BqJYX362Va+rcn3YacPCM3QLF=K=w@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 1, 2017 at 2:11 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Tue, Oct 31, 2017 at 5:07 PM, Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> Another complaint is that perhaps fd.c
>> knows too much about buffile.c's business. For example,
>> RemovePgTempFilesInDir() knows about the ".set" directories created by
>> buffile.c, which might be called a layering violation. Perhaps the
>> set/directory logic should move entirely into fd.c, so you'd call
>> FileSetInit(FileSet *), not BufFileSetInit(BufFileSet *), and then
>> BufFileOpenShared() would take a FileSet *, not a BufFileSet *.
>> Thoughts?
>
> I'm going to make an item on my personal TODO list for that. No useful
> insights on that right now, though.

I decided to try that, but it didn't really work: fd.h gets included
by front-end code, so I can't very well define a struct and declare
functions that deal in dsm_segment and slock_t there. On the other hand,
it does seem a bit better for these shared file sets to work in terms
of File, not BufFile. That way you don't have to opt in to BufFile's
double buffering and segmentation schemes just to get shared file
clean-up, if for some reason you want direct file handles. So in the
v24 parallel hash patch set I just posted over in the other thread, I
have moved it into its own translation unit, sharedfileset.c, and made
it work with File objects. buffile.c knows how to use it as a source
of segment files. I think that's better.
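
To make the shape of that concrete, here is a rough standalone toy model,
not the code from the patch. Every name in it (ToyFileSet, toy_fileset_*)
is made up for illustration, it uses plain POSIX descriptors where the
real thing would hand out File objects, and it is single-process, so it
leaves out the spinlock and the DSM detach hook that a shared-memory
version would need. It only shows the idea: a reference-counted set that
hands out direct file handles, with the last participant to detach
cleaning everything up.

```c
/* Toy model only: hypothetical names, not the PostgreSQL API. */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define TOY_MAX_FILES 8
#define TOY_PATH_MAX 256

typedef struct ToyFileSet
{
    int  refcnt;                      /* attached participants */
    int  nfiles;                      /* files created so far */
    char dir[TOY_PATH_MAX];           /* directory holding the set */
    char names[TOY_MAX_FILES][64];    /* file names, for clean-up */
} ToyFileSet;

/* The creator sets up the directory and counts as first participant. */
static int
toy_fileset_init(ToyFileSet *set, const char *dir)
{
    set->refcnt = 1;
    set->nfiles = 0;
    snprintf(set->dir, sizeof(set->dir), "%s", dir);
    return mkdir(set->dir, 0700);
}

/* Another participant registers its interest in the set. */
static void
toy_fileset_attach(ToyFileSet *set)
{
    assert(set->refcnt > 0);
    set->refcnt++;
}

/* Create a file in the set; the caller gets a direct descriptor. */
static int
toy_fileset_create(ToyFileSet *set, const char *name)
{
    char path[TOY_PATH_MAX + 80];
    int  fd;

    if (set->nfiles >= TOY_MAX_FILES)
        return -1;
    snprintf(set->names[set->nfiles], sizeof(set->names[0]), "%s", name);
    snprintf(path, sizeof(path), "%s/%s", set->dir, set->names[set->nfiles]);
    fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd >= 0)
        set->nfiles++;
    return fd;
}

/* Drop one reference; returns 1 if this call removed the files. */
static int
toy_fileset_detach(ToyFileSet *set)
{
    char path[TOY_PATH_MAX + 80];
    int  i;

    assert(set->refcnt > 0);
    if (--set->refcnt > 0)
        return 0;
    for (i = 0; i < set->nfiles; i++)
    {
        snprintf(path, sizeof(path), "%s/%s", set->dir, set->names[i]);
        unlink(path);
    }
    rmdir(set->dir);
    return 1;
}
```

In this toy, a caller would run toy_fileset_init() once, have each extra
participant call toy_fileset_attach(), and have everyone call
toy_fileset_detach() when done. A buffered layer could sit on top and
ask the set for segment files, but nothing forces a caller through it.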

> If the new standard is that you have temp file names that suggest the
> purpose of each temp file, then that may be something that parallel
> CREATE INDEX should buy into.

Yeah, I guess that could be useful.

--
Thomas Munro
http://www.enterprisedb.com
