Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Subject: Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Date: 2018-01-22 20:15:48
Message-ID: CAH2-WzmQ7Bpk3JXGvacPoitCRa4j3p7A9_y+RziEEqiiKoTQqQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 22, 2018 at 3:52 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> The difference is that nodeGather.c doesn't have any logic like the
> one you have in _bt_leader_heapscan where the patch waits for each
> worker to increment nparticipantsdone. For Gather node, we do such a
> thing (wait for all workers to finish) by calling
> WaitForParallelWorkersToFinish which will have the capability after
> Robert's patch to detect if any worker is exited abnormally (fork
> failure or failed before attaching to the error queue).

FWIW, I don't think that's really much of a difference.

ExecParallelFinish() calls WaitForParallelWorkersToFinish(), which is
similar to how _bt_end_parallel() calls
WaitForParallelWorkersToFinish() in the patch. The
_bt_leader_heapscan() condition variable wait for workers that you
refer to is quite a bit like how gather_readnext() behaves. It
generally checks to make sure that all tuple queues are done.
gather_readnext() can wait for developments using WaitLatch(), to make
sure every tuple queue is visited, with all output reliably consumed.
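
To make the waiting pattern concrete, here is a minimal sketch of a leader
blocking on a condition variable until every launched worker has bumped a
shared "done" counter. It is not the patch's code: it uses POSIX threads
as a stand-in for PostgreSQL's parallel workers and condition variables,
and SharedScan, worker_main() and run_leader_wait() are hypothetical names
made up for illustration. Only nparticipantsdone is borrowed from the
thread.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical shared state; only the counter name comes from the thread. */
typedef struct SharedScan
{
    pthread_mutex_t mutex;
    pthread_cond_t  workersdone;        /* signaled each time a worker finishes */
    int             nparticipantsdone;  /* number of workers that have finished */
} SharedScan;

static void *
worker_main(void *arg)
{
    SharedScan *shared = arg;

    /* A real worker would scan and sort its share of the input here. */

    pthread_mutex_lock(&shared->mutex);
    shared->nparticipantsdone++;
    pthread_mutex_unlock(&shared->mutex);
    pthread_cond_signal(&shared->workersdone);
    return NULL;
}

/*
 * Launch up to nworkers threads, then wait until all launched workers
 * report completion.  Returns the final value of nparticipantsdone.
 */
int
run_leader_wait(int nworkers)
{
    SharedScan  shared;
    pthread_t  *threads;
    int         nlaunched = 0;
    int         done;
    int         i;

    if (nworkers <= 0)
        return 0;
    threads = malloc(sizeof(pthread_t) * (size_t) nworkers);
    if (threads == NULL)
        return -1;

    pthread_mutex_init(&shared.mutex, NULL);
    pthread_cond_init(&shared.workersdone, NULL);
    shared.nparticipantsdone = 0;

    /* Count only workers that actually started, so a failed launch cannot hang us. */
    for (i = 0; i < nworkers; i++)
    {
        if (pthread_create(&threads[nlaunched], NULL, worker_main, &shared) == 0)
            nlaunched++;
    }

    /* Leader side: sleep until the counter catches up with nlaunched. */
    pthread_mutex_lock(&shared.mutex);
    while (shared.nparticipantsdone < nlaunched)
        pthread_cond_wait(&shared.workersdone, &shared.mutex);
    done = shared.nparticipantsdone;
    pthread_mutex_unlock(&shared.mutex);

    for (i = 0; i < nlaunched; i++)
        pthread_join(threads[i], NULL);

    pthread_cond_destroy(&shared.workersdone);
    pthread_mutex_destroy(&shared.mutex);
    free(threads);
    return done;
}
```

Calling run_leader_wait(4) returns once all four workers have checked in.
The sketch waits only for workers that were actually launched; a leader
that waits on the requested count instead would sleep forever if a worker
never started, which is the kind of failure being discussed here.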

This doesn't look all that similar to _bt_leader_heapscan(), I
suppose, but I think that's only because it's normal for all
output to become available all at once for nbtsort.c workers. The
startup cost is close to or actually the same as the total cost, as it
*always* is for sort nodes.

--
Peter Geoghegan
