Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

From: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Subject: Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Date: 2018-01-02 09:38:14
Views: Raw Message | Whole Thread | Download mbox
Lists: pgsql-hackers

On Sun, Dec 31, 2017 at 9:59 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> On Tue, Dec 12, 2017 at 2:09 AM, Rushabh Lathia
> <rushabh(dot)lathia(at)gmail(dot)com> wrote:
> >> I now believe that index_create() should reject catalog parallel
> >> CREATE INDEX directly, just as it does for catalog CREATE INDEX
> >> CONCURRENTLY. That logic should be generic to all AMs, since the
> >> reasons for disallowing catalog parallel index builds are generic.
> >>
> >
> > Sorry I didn't get this, reject means? you mean it should throw an error
> > catalog parallel CREATE INDEX? or just suggesting to set the
> > ParallelWorkers and may be LeaderAsWorker from index_create()
> > or may be index_build()?
> I mean that we should be careful to make sure that AM-generic parallel
> CREATE INDEX logic does not end up in a specific AM (nbtree).
Ah okay, that's what I thought.

> The patch *already* refuses to perform a parallel CREATE INDEX on a
> system catalog, which is what I meant by reject (sorry for being
> unclear). The point is that that's due to a restriction that has
> nothing to do with nbtree in particular (just like the CIC restriction
> on catalogs), so it should be performed within index_build(). Just
> like the similar CONCURRENTLY-on-a-catalog restriction, though without
> throwing an error, since of course the user doesn't explicitly ask for
> a parallel CREATE INDEX at any point (unlike CONCURRENTLY).
> Once we go this way, the cost model has to be called at that point,
> too. We already have the AM-specific "OldIndex->rd_rel->relam ==
> BTREE_AM_OID" tests within cluster.c, even though theoretically
> another AM might be involved with CLUSTER in the future, which this
> seems similar to.
> So, I propose the following (this is a rough outline):
> * Add new IndexInfo files after ii_Concurrent/ii_BrokenHotChain --
> ii_ParallelWorkers and ii_LeaderAsWorker.
> * Call plan_create_index_workers() within index_create(), assigning to
> ii_ParallelWorkers, and fill in ii_LeaderAsWorker from the
> parallel_leader_participation GUC. Add comments along the lines of
> "only nbtree supports parallel builds". Test the index with a
> "heapRelation->rd_rel->relam == BTREE_AM_OID" to make this work.
> Otherwise, assign zero to ii_ParallelWorkers (and leave
> ii_LeaderAsWorker as false).
> * For builds on catalogs, or builds using other AMs, don't let
> parallelism go ahead by immediately assigning zero to
> ii_ParallelWorkers within index_create(), near where the similar CIC
> test occurs already.
> What do you think of that?

Need to do after the indexRelation build. So I added after update of
as indexRelation needed for plan_create_index_worders().

Attaching the separate patch the same.

> >> Do you think that the main part of the cost model needs to care about
> >> parallel_leader_participation, too?
> >>
> >> compute_parallel_worker() assumes that the caller is planning a
> >> parallel-sequential-scan-alike thing, in the sense that the leader
> >> only acts like a worker in cases that probably don't have many
> >> workers, where the leader cannot keep itself busy as a leader. That's
> >> actually quite different to parallel CREATE INDEX, because the
> >> leader-as-worker state will behave in exactly the same way as a worker
> >> would, no matter how many workers there are. The leader process is
> >> guaranteed to give its full attention to being a worker, because it
> >> has precisely nothing else to do until workers finish. This makes me
> >> think that we may need to immediately do something with the result of
> >> compute_parallel_worker(), to consider whether or not a
> >> leader-as-worker state should be used, despite the fact that no
> >> existing compute_parallel_worker() caller does anything like this.
> >>
> >
> > I agree with you. compute_parallel_worker() mainly design for the
> > scan-alike things. Where as parallel create index is different in a
> > sense where leader has as much power as worker. But at the same
> > time I don't see any side effect or negative of that with PARALLEL
> > CREATE INDEX. So I am more towards not changing that aleast
> > for now - as part of this patch.
> I've also noticed is that there is little to no negative effect on
> CREATE INDEX duration from adding new workers past the point where
> adding more workers stops making the build faster. It's quite clear.
> And, in general, there isn't all that much theoretical justification
> for the cost model (it's essentially the same as any other parallel
> scan), which doesn't seem to matter much. So, I agree that it doesn't
> really matter in practice, but disagree that it should not still be
> changed -- the justification may be a little thin, but I think that we
> need to stick to it. There should be a theoretical justification for
> the cost model that is coherent in the wider context of costs models
> for parallelism in general. It should not be arbitrarily inconsistent
> just because it apparently doesn't matter that much. It's easy to fix
> -- let's just fix it.

So you suggesting that need to do adjustment with the output of
compute_parallel_worker() by considering parallel_leader_participation?

Rushabh Lathia

Attachment Content-Type Size
0004-Calculate-parallel-worker-inside-index_create.patch text/x-patch 5.2 KB
0003-Adjust-planner-code-to-consider-parallel_leader_part.patch text/x-patch 4.6 KB
0002-Obtain-a-RowExclusiveLock-on-the-index-relation-with.patch text/x-patch 1.1 KB
0001-Add-parallel-tuple-sort-for-CREATE-INDEX.patch text/x-patch 150.4 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2018-01-02 09:48:24 Re: [HACKERS] pgbench randomness initialization
Previous Message Andrey Borodin 2018-01-02 09:33:57 Re: [HACKERS] wrong t_bits alignment in pageinspect