Re: Add parallelism and glibc dependent only options to reindexdb

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Kevin Grittner <kgrittn(at)gmail(dot)com>
Subject: Re: Add parallelism and glibc dependent only options to reindexdb
Date: 2019-07-12 09:47:52
Message-ID: CAOBaU_Y8Ts+TUX4Uf+5jqUKeZ-3zdxZQF9ZStr+x8gDmrxTZzQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 12, 2019 at 7:57 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Fri, Jul 12, 2019 at 07:49:13AM +0200, Julien Rouhaud wrote:
> > It shouldn't be a problem, I reused the same infrastructure as for
> > vacuumdb. so run_reindex_command has a new "async" parameter, so when
> > there's no parallelism it's using executeMaintenanceCommand (instead
> > of PQsendQuery) which will block until query completion. That's why
> > there's no isFree usage at all in this case.
>
> My point is more about consistency and simplification with the case
> where n > 1 and that we could actually move the async/sync code paths
> into the same banner as the async mode waits as well until a slot is
> free, or in short when the query completes.

I attach v4 with all previous comments addressed.

I also changed the code to handle the parallel and non-parallel cases
the same way. I kept the possibility of synchronous behavior in
reindexdb, as some queries need to be run early in the case of a
parallel database-wide reindex. This avoids opening all the connections
in case anything fails during this preliminary work, and it also avoids
another call to the async wait function. If we add parallelism to
clusterdb (I'll probably work on that next time I have spare time),
reindexdb would be the only caller left of
executeMaintenanceCommand(), so that's something we may want to
change.
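
For readers who want a picture of the sync/async split discussed above, here is a rough sketch. It is an illustration only, not code from the attached patches: the Slot struct and the find_free_slot(), wait_for_any_slot() and run_command() names are hypothetical stand-ins, and the libpq calls (PQsendQuery on the async path, a blocking command on the sync path) are replaced by simple bookkeeping so that the control flow stays visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a connection slot; a real slot would also
 * carry a libpq connection, which is omitted in this sketch. */
typedef struct Slot
{
    bool isFree;       /* true when no command is in flight */
    int  commandsRun;  /* number of commands dispatched on this slot */
} Slot;

/* Return the index of the first free slot, or -1 if all are busy. */
static int
find_free_slot(const Slot *slots, int nslots)
{
    for (int i = 0; i < nslots; i++)
        if (slots[i].isFree)
            return i;
    return -1;
}

/* Simulated wait: pretend the first busy slot has finished its command.
 * Real code would block on the connections' sockets instead. */
static int
wait_for_any_slot(Slot *slots, int nslots)
{
    for (int i = 0; i < nslots; i++)
    {
        if (!slots[i].isFree)
        {
            slots[i].isFree = true;
            return i;
        }
    }
    return -1;
}

/* Dispatch one command through the same entry point for both modes.
 * async = true:  the slot stays busy until a later wait frees it.
 * async = false: the command is assumed to have completed on return,
 *                so the slot is immediately free again. */
static int
run_command(Slot *slots, int nslots, bool async)
{
    assert(nslots > 0);

    int idx = find_free_slot(slots, nslots);

    if (idx < 0)
        idx = wait_for_any_slot(slots, nslots);

    slots[idx].commandsRun++;
    slots[idx].isFree = !async;
    return idx;
}
```

In this sketch a synchronous call never leaves a slot busy, which is why the preliminary queries can run over a single connection before any additional connections are opened.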

I didn't change the behavior wrt. a possible deadlock if the user
specifies catalog objects using --index or --table and asks for
multiple connections, as I'm afraid that it would add too much code for
little benefit. Please shout if you think otherwise.

Attachment | Content-Type | Size
0001-Export-vacuumdb-s-parallel-infrastructure-v4.patch | application/octet-stream | 23.6 KB
0002-Add-parallel-processing-to-reindexdb-v4.patch | application/octet-stream | 17.3 KB
