Re: Add parallelism and glibc dependent only options to reindexdb

From: Julien Rouhaud <julien(dot)rouhaud(at)free(dot)fr>
To: Sergei Kornilov <sk(at)zsrv(dot)org>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add parallelism and glibc dependent only options to reindexdb
Date: 2019-07-25 08:42:11
Message-ID: CAOBaU_YDTym=Er78KCrf4HK_DKC3OgeSwiCqdBdP=h1nWD1=7Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the review!

On Thu, Jul 25, 2019 at 10:17 AM Sergei Kornilov <sk(at)zsrv(dot)org> wrote:
>
> The following review has been posted through the commitfest application:
> make installcheck-world: tested, passed
> Implements feature: tested, failed
> Spec compliant: not tested
> Documentation: tested, passed
>
> Hi
>
> I did some review and have few notes about behavior.
>
> reindex database does not work with concurrently option:
>
> > ./inst/bin/reindexdb --echo -d postgres -j 8 --concurrently
> > SELECT pg_catalog.set_config('search_path', '', false);
> > REINDEX SYSTEM CONCURRENTLY postgres;
> > reindexdb: error: reindexing of system catalogs on database "postgres" failed: ERROR: cannot reindex system catalogs concurrently
>
> I think we need print message and skip system catalogs for concurrently reindex.
> Or we can disallow concurrently database reindex with multiple jobs. I prefer first option.

Good point. I agree with 1st option, as that's already what would
happen without the --jobs switch:

$ reindexdb -d postgres --concurrently
WARNING: cannot reindex system catalogs concurrently, skipping all

(although this is emitted by the backend)
I modified the client code to behave the same and added a regression test.

> > + reindex_one_database(dbname, REINDEX_SCHEMA, &schemas, host,
> > + port, username, prompt_password, progname,
> > + echo, verbose, concurrently,
> > + Min(concurrentCons, nsp_count));
>
> Should be just concurrentCons instead of Min(concurrentCons, nsp_count)

Indeed, that changed with v8 and I forgot to update it, fixed.

> reindex_one_database for REINDEX_SCHEMA will build tables list and then processing by available workers. So:
> -j 8 -S public -S public -S public -S poblic -S public -S public - will work with 6 jobs (and without multiple processing for same table)
> -j 8 -S public - will have only one worker regardless tables count
>
> > if (concurrentCons > FD_SETSIZE - 1)
>
> "if (concurrentCons >= FD_SETSIZE)" would not cleaner? Well, pgbench uses >= condition, vacuumdb uses > FD_SETSIZE - 1. No more FD_SETSIZE in conditions =)

I don't have a strong opinion here. If we change for >=, it'd be
better to also adapt vacuumdb for consistency. I didn't change it for
now, to stay consistent with vacuumdb.

Attachment Content-Type Size
reindex_parallel_v9.diff application/octet-stream 19.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2019-07-25 08:45:00 Re: dropdb --force
Previous Message Jehan-Guillaume de Rorthais 2019-07-25 08:29:44 Re: pg_receivewal documentation