Re: REINDEX CONCURRENTLY 2.0

From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Jim Nasby <jim(at)nasby(dot)net>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REINDEX CONCURRENTLY 2.0
Date: 2017-02-17 12:53:24
Message-ID: 1da61300-31e8-d416-1d41-56c15cd4753d@proxel.se
Lists: pgsql-hackers

On 02/14/2017 04:56 AM, Michael Paquier wrote:
> On Tue, Feb 14, 2017 at 11:32 AM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
>> On 02/13/2017 06:31 AM, Michael Paquier wrote:
>>> Er, something like that as well, no?
>>> DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
>>
>> REINDEX (VERBOSE) currently prints one such line per index, which does not
>> really work for REINDEX (VERBOSE) CONCURRENTLY since it handles all indexes
>> on a relation at the same time. It is not immediately obvious how this
>> should work. Maybe one such detail line per table?
>
> Hard to recall this thing in details with the time and the fact that a
> relation is reindexed by processing all the indexes once at each step.
> Hm... What if ReindexRelationConcurrently() actually is refactored in
> such a way that it processes all the steps for each index
> individually? This way you can monitor the time it takes to build
> completely each index, including its . This operation would consume
> more transactions but in the event of a failure the amount of things
> to clean up is really reduced particularly for relations with many
> indexes. This would as well reduce VERBOSE to print one line per index
> rebuilt.

I am actually thinking about going in the opposite direction (by reducing
the number of times we call WaitForLockers), because it is not just
about consuming transaction IDs: we also do not want to wait too many
times for transactions to commit. I am leaning towards only calling
WaitForLockersMultiple three times per table.

1. Between building and validating the new indexes.
2. Between setting the old indexes to invalid and setting them to dead.
3. Between setting the old indexes to dead and dropping them.

Right now my patch loops over the indexes in steps 2 and 3 and waits for
lockers once per index. This seems rather wasteful.

I have considered that the code might be cleaner if we simply looped
over the indexes one at a time (and as a bonus the VERBOSE output would
be more obvious), but I do not think it is worth waiting for lockers
all those extra times.

Andreas
