Re: Parallel tuplesort (for parallel B-Tree index creation)

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Parallel tuplesort (for parallel B-Tree index creation)
Date: 2016-10-28 00:39:07
Message-ID: CAM3SWZQry9V0EPJ551117XHF_1WK6T6baDqckvY2nU4prQCY=A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 19, 2016 at 11:33 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> I don't think that eager merging will prove all that effective,
> however it's implemented. I see a very I/O bound system when parallel
> CREATE INDEX merges serially. There is no obvious reason why you'd
> have a straggler worker process with CREATE INDEX, really.

In an effort to head off any misunderstanding around this patch
series, I started a new Wiki page for it:

https://wiki.postgresql.org/wiki/Parallel_External_Sort

This talks about parallel CREATE INDEX in particular, and uses of
parallel external sort more generally, including future uses beyond
CREATE INDEX.

This approach worked very well for me during the UPSERT project, where
a detailed overview really helped. With UPSERT, it was particularly
difficult to keep the *current* state of things straight, such as
current open items for the patch, areas of disagreement, and areas
where there was no longer any disagreement or controversy. I don't
think that this patch is even remotely as complicated as UPSERT was,
but it's still something that has had several concurrently active
mailing list threads (threads that are at least loosely related to the
project), so I think that this will be useful.

I welcome anyone with an interest in this project to review the Wiki
page, add their own concerns to it with -hackers citation, and add
their own content around related work. There is a kind of unresolved
question around where the Gather Merge work might fit into what I've
come up with already. There may be other, similar unresolved questions
that I'm not even aware of.

I commit to maintaining the new Wiki page as a useful starting
reference for understanding the current state of this patch. I hope
this makes looking into the patch series less intimidating for
potential reviewers.

--
Peter Geoghegan
