Re: REINDEX CONCURRENTLY 2.0

From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Subject: Re: REINDEX CONCURRENTLY 2.0
Date: 2017-03-08 15:34:55
Message-ID: 025c1e3d-1586-c678-922c-c01e12bb013e@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 03/08/2017 03:48 AM, Robert Haas wrote:
> On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
>> And I would argue that his feature is useful for quite many, based on my
>> experience running a semi-large database. Index bloat happens and without
>> REINDEX CONCURRENTLY it can be really annoying to solve, especially for
>> primary keys. Certainly more people have problems with index bloat than the
>> number of people who store index oids in their database.
>
> Yeah, but that's not the only wart, I think.

The only two potential issues I see with the patch are:

1) That the index oid changes visibly to external users.

2) That the code for moving the dependencies will need to be updated
when adding new things which refer to an index oid.

Given how useful I find REINDEX CONCURRENTLY I think these warts are
worth it given that the impact is quite limited. I am of course biased
since if I did not believe this I would not pursue this solution in the
first place.

> For example, I believe
> (haven't looked at this patch series in a while) that the patch takes
> a lock and later escalates the lock level. If so, that could lead to
> doing a lot of work to build the index and then getting killed by the
> deadlock detector.

This version of the patch no longer does that. For my use case
escalating the lock would make this patch much less interesting. The
highest lock level taken is the same one as the initial one (SHARE
UPDATE EXCLUSIVE). The current patch does on a high level (very
simplified) this:

1. CREATE INDEX CONCURRENTLY ind_new;
2. Atomically move all dependencies from ind to ind_new, rename ind to
ind_old, and rename ind_new to ind.
3. DROP INDEX CONCURRENTLY ind_old;

The actual implementation is a bit more complicated in reality, but no
part escalates the lock level over what would be required by the steps
for creating and dropping indexes concurrently

> Also, if by any chance you think (or use any
> software that thinks) that OIDs for system objects are a stable
> identifier, this will be the first case where that ceases to be true.
> If the system is shut down or crashes or the session is killed, you'll
> be left with stray objects with names that you've never typed into the
> system. I'm sure you're going to say "don't worry, none of that is
> any big deal" and maybe you're right.

Hm, I cannot think of any real life scenario where this will be an issue
based on my personal experience with PostgreSQL, but if you can think of
one please provide it. I will try to ponder some more on this myself.

Andreas

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2017-03-08 15:49:33 Re: Explicit subtransactions for PL/Tcl
Previous Message Mark Dilger 2017-03-08 15:16:59 Re: Hash support for grouping sets