Re: REINDEX CONCURRENTLY 2.0

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REINDEX CONCURRENTLY 2.0
Date: 2014-11-14 08:04:00
Message-ID: 5465B770.70604@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/13/14, 3:50 PM, Andres Freund wrote:
> On November 13, 2014 10:23:41 PM CET, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>> On 11/12/14 7:31 PM, Andres Freund wrote:
>>> Yes, it sucks. But it beats not being able to reindex a relation with
>> a
>>> primary key (referenced by a fkey) without waiting several hours by a
>>> couple magnitudes. And that's the current situation.
>>
>> That's fine, but we have, for better or worse, defined CONCURRENTLY :=
>> does not take exclusive locks. Use a different adverb for an
>> in-between
>> facility.
>
> I think that's not actually a service to our users. They'll have to adapt their scripts and knowledge when we get around to the more concurrent version. What exactly CONCURRENTLY means is already not strictly defined and differs between the actions.

It also means that if we ever found a way to get rid of the exclusive lock we'd then have an inconsistency anyway. Or we'd also create REINDEX CONCURRENT at that time, and then have 2 command syntaxes to support.

> I'll note that DROP INDEX CONCURRENTLY actually already internally acquires an AEL lock. Although it's a bit harder to see the consequences of that.

Having been responsible for a site where downtime was a 6 figure dollar amount per hour, I've spent a LOT of time worrying about lock problems. The really big issue here isn't grabbing an exclusive lock; it's grabbing one at some random time when no one is there to actively monitor what's happening. (If you can't handle *any* exclusive locks, that also means you can never do an ALTER TABLE ADD COLUMN either.) With that in mind, would it be possible to set this up so that the time-consuming process of building the new index file happens first, and then (optionally) some sort of DBA action is required to actually do the relfilenode swap? I realize that's not the most elegant solution, but it's WAY better than this feature not hitting 9.5 and people having to hand-code a solution.

Possible syntax:
REINDEX CONCURRENTLY -- Does what current patch does
REINDEX CONCURRENT BUILD -- Builds new files
REINDEX CONCURRENT SWAP -- Swaps new files in

This suffers from the syntax problems I mentioned above, but at least this way it's all limited to one command, and it probably allows a lot more people to use it.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2014-11-14 08:17:24 Re: Change in HEAP_NEWPAGE logging makes diagnosis harder
Previous Message furuyao 2014-11-14 08:00:05 Customized Options Threshold Error