Re: Parallel tuplesort (for parallel B-Tree index creation)

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Claudio Freire <klaussfreire(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Parallel tuplesort (for parallel B-Tree index creation)
Date: 2016-09-11 18:05:07
Message-ID: CAM3SWZTW+3vmZugxhy=_jdm5YxNXGdAEHv5Sz6jBpdyfsJVkRg@mail.gmail.com
Lists: pgsql-hackers

On Sun, Sep 11, 2016 at 6:28 AM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> Pushed this "displace root" patch, with some changes:

Attached is a rebased version of the entire patch series, which should
be applied on top of what you pushed to the master branch today.

This features a new scheme for managing workMem: maintenance_work_mem
is now treated as a high watermark/budget for the entire CREATE INDEX
operation, regardless of the number of workers. This seems to work
much better, so Robert was right to suggest it.
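
As a rough illustration of the budgeting idea only: the sketch below
assumes the overall budget is split evenly among participants. The
helper name and the even split are hypothetical, chosen for
illustration, and are not taken from the attached patches.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch series): treat one overall
 * budget, expressed in kilobytes like maintenance_work_mem, as the cap
 * for the whole operation and hand each participant an equal share.
 * The even split is an assumption made for this sketch.
 */
static size_t
per_participant_budget_kb(size_t total_budget_kb, int nparticipants)
{
    assert(nparticipants > 0);

    /* Integer division rounds down, so the shares never exceed the cap. */
    return total_budget_kb / (size_t) nparticipants;
}
```

With this shape, adding workers shrinks each participant's share
rather than growing the total memory used by the operation.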

There were also improvements to the cost model, to weigh available
maintenance_work_mem under this new system. And, the cost model was
moved inside planner.c (next to plan_cluster_use_sort()), which is
really where it belongs. The cost model is still WIP, though, and I
didn't address some concerns of my own about how tuplesort.c
coordinates workers. I think that Robert's "condition variables" will
end up superseding that stuff anyway. And, I think that this v2 will
bitrot fairly soon, when Heikki commits what is in effect his version
of my 0002-* patch (that's unchanged, if only because it refactors
some things that the parallel CREATE INDEX patch is reliant on).

So, while there are still a few loose ends with this revision (it
should still certainly be considered WIP), I wanted to get a revision
out quickly because v1 has been left to bitrot for too long now, and
my schedule is very full for the next week, ahead of my leaving to go
on vacation (which is long overdue). Hopefully, I'll be able to get
out a third revision next Saturday, on top of the
by-then-presumably-committed new tape batch memory patch from Heikki,
just before I leave. I'd rather leave with a patch available that can
be cleanly applied, to make review as easy as possible, since it
wouldn't be great to have this v2 with bitrot for 10 days or more.

--
Peter Geoghegan

Attachment                                                        Content-Type        Size
0003-Rearrange-header-file-include-directives.patch.gz            application/x-gzip  1.4 KB
0005-Add-force_btree_randomaccess-GUC-for-testing.patch.gz        application/x-gzip  1.7 KB
0001-Cap-the-number-of-tapes-used-by-external-sorts.patch.gz      application/x-gzip  1.8 KB
0002-Use-tuplesort-batch-memory-for-randomAccess-sorts.patch.gz   application/x-gzip  8.6 KB
0004-Add-parallel-B-tree-index-build-sorting.patch.gz             application/x-gzip  56.3 KB
