Re: pgsql: Support parallel btree index builds.

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Support parallel btree index builds.
Date: 2021-06-07 23:11:53
Message-ID: 202106072311.6zryr7i2xuhx@alvherre.pgsql
Lists: pgsql-committers pgsql-hackers

On 2018-Feb-02, Robert Haas wrote:

> Support parallel btree index builds.

While looking at a complaint related to progress reporting for parallel
index builds[1], I noticed this comment:

+    /*
+     * Execute this worker's part of the sort.
+     *
+     * Unlike leader and serial cases, we cannot avoid calling
+     * tuplesort_performsort() for spool2 if it ends up containing no dead
+     * tuples (this is disallowed for workers by tuplesort).
+     */
+    tuplesort_performsort(btspool->sortstate);
+    if (btspool2)
+        tuplesort_performsort(btspool2->sortstate);

I've been trying to understand why this says "Unlike leader and serial
cases, ...". I understand the "serial" part -- it refers to
_bt_leafbuild. So I'm to understand that that one works differently;
see below. But why does it say "the leader case"? As far as I can see,
the leader executes exactly the same code, so what is the comment
talking about?

Now, if you do look at _bt_leafbuild(), it can be seen that nothing is
done differently there either; we're not actually skipping any calls to
tuplesort_performsort(). Any differentiation between the serial, leader
and worker cases seems to be done inside tuplesort_performsort() itself.
So the comment is not very useful there either.

I am wondering if these comments are leftovers from early development
versions of this patch. Maybe we could remove them -- or rewrite them
to indicate not that we avoid calling tuplesort_performsort(), but
instead to say that that function behaves differently.
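
To illustrate the distinction being drawn, here is a toy model in C. It is
not PostgreSQL code: every name in it (ToyRole, ToySortState,
toy_performsort, toy_sort_spools) is hypothetical, and the outcomes are
placeholders rather than a description of what tuplesort.c really does. It
only sketches the shape described above, where the caller invokes the sort
unconditionally for every role and any role-specific behavior lives inside
the sort routine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical roles and outcomes; these are not PostgreSQL's types. */
typedef enum { TOY_SERIAL, TOY_LEADER, TOY_WORKER } ToyRole;

typedef enum {
    TOY_NONE,
    TOY_SORTED_IN_MEMORY,    /* placeholder outcome for the serial case */
    TOY_MERGED_WORKER_RUNS,  /* placeholder outcome for the leader case */
    TOY_RUN_WRITTEN          /* placeholder outcome for the worker case */
} ToyOutcome;

typedef struct {
    ToyRole    role;
    ToyOutcome outcome;
} ToySortState;

/* All role-specific behavior is decided here, not in the caller. */
static void
toy_performsort(ToySortState *state)
{
    assert(state != NULL);

    switch (state->role)
    {
        case TOY_SERIAL:
            state->outcome = TOY_SORTED_IN_MEMORY;
            break;
        case TOY_LEADER:
            state->outcome = TOY_MERGED_WORKER_RUNS;
            break;
        case TOY_WORKER:
            state->outcome = TOY_RUN_WRITTEN;
            break;
    }
}

/*
 * The caller is identical for every role: it never skips the call based on
 * role, only on whether the second spool exists at all.
 */
static void
toy_sort_spools(ToySortState *spool, ToySortState *spool2)
{
    toy_performsort(spool);
    if (spool2)
        toy_performsort(spool2);
}
```

Under this model, a comment at the call site would more accurately say that
the sort routine behaves differently per role than that the call is avoided
in some cases.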


Álvaro Herrera 39°49'30"S 73°17'W
"Puedes vivir sólo una vez, pero si lo haces bien, una vez es suficiente"
