Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Subject: Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Date: 2018-01-12 13:19:23
Message-ID: CAA4eK1+izMyxzFD6k81Deyar35YJ5qdpbRTUp9cQvo+niQom7Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jan 6, 2018 at 3:47 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Tue, Jan 2, 2018 at 8:43 PM, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com> wrote:
>> I agree that plan_create_index_workers() needs to count the leader as a
>> normal worker for the CREATE INDEX. So what you proposing is - when
>> parallel_leader_participation is true launch (return value of
>> compute_parallel_worker() - 1)
>> workers. true ?
>
> Almost. We need to not subtract one when only one worker is indicated
> by compute_parallel_worker(). I also added some new stuff there, to
> consider edge cases with the parallel_leader_participation GUC.
>
>>> I'm working on fixing up what you posted. I'm probably not more than a
>>> week away from posting a patch that I'm going to mark "ready for
>>> committer". I've already made the change above, and once I spend time
>>> on trying to break the few small changes needed within buffile.c I'll
>>> have taken it as far as I can, most likely.
>>>
>>
>> Okay, once you submit the patch with changes - I will do one round of
>> review for the changes.
>
> I've attached my revision. Changes include:
>

Few observations while skimming through the patch:

1.
+ if (!IsBootstrapProcessingMode() && !indexInfo->ii_Concurrent)
{
- snapshot = RegisterSnapshot(GetTransactionSnapshot());
- OldestXmin = InvalidTransactionId; /* not used */
+ OldestXmin = GetOldestXmin(heapRelation, true);

I think leader and workers should have the same idea of oldestXmin for
the purpose of deciding the visibility of tuples. I think this is
ensured in all form of parallel query as we do share the snapshot,
however, same doesn't seem to be true for Parallel Index builds.

2.
+
+ /* Wait on worker processes to finish (should be almost instant) */
+ reltuples = _bt_leader_wait_for_workers(buildstate);

Can't we use WaitForParallelWorkersToFinish for this purpose? The
reason is that if we use a different mechanism here then we might need
a different way to solve the problem related to fork failure. See
thread [1]. Basically, what if postmaster fails to launch workers due
to fork failure, the leader backend might wait indefinitely.

[1] - https://commitfest.postgresql.org/16/1341/

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Marina Polyakova 2018-01-12 14:10:10 Re: WIP Patch: Pgbench Serialization and deadlock errors
Previous Message Arthur Zakirov 2018-01-12 12:58:49 Re: [HACKERS] Bug in to_timestamp().