Re: Support for REINDEX CONCURRENTLY

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for REINDEX CONCURRENTLY
Date: 2012-10-05 00:33:56
Message-ID: CAB7nPqTLQfQc8VqwQPhaSy1fPxJKBU3O_9o_9uhLWNx7Jn3y6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Oct 5, 2012 at 6:58 AM, Andres Freund <andres(at)2ndquadrant(dot)com>wrote:

> On Thursday, October 04, 2012 04:51:29 AM Tom Lane wrote:
> I can understand hesitation around that.. I would like to make sure I
> understand the problem correctly. When we get to the point where we switch
> indexes we should be in the following state:
> - both indexes are indisready
> - old should be invalid
> - new index should be valid
> - have the same indcheckxmin
> - be locked by us preventing anybody else from making changes
>
Looks like a good presentation of the problem. I am not sure if marking the
new index as valid is necessary though. As long as it is done inside the
same transaction as the swap there are no problems, no?

> Lets assume we have index a_old(relfilenode 1) as the old index and a
> rebuilt
> index a_new (relfilenode 2) as the one we just built. If we do it properly
> nobody will have 'a' open for querying, just for modifications (its
> indisready)
> as we had waited for everyone that could have seen a as valid to finish.
>
> As far as I understand the code a session using a_new will also have built
> a
> relcache entry for a_old.
> Two problems:
> * relying on the relcache to be built for both indexes seems hinky
> * As the relcache is built with SnapshotNow it could read the old
> definition
> for a_new and the new one for a_old (or the reverse) and thus end up with
> both
> pointing to the same relfilenode. Which would be ungood.
>
OK, so the problem here is that the relcache, as the syscache, are relying
on SnapshotNow which cannot be used safely as the false index definition
could be read by other backends. So this looks to bring back the discussion
to the point where a higher lock level is necessary to perform a safe
switch of the indexes.

I assume that the switch phase is not the longest phase of the concurrent
operation, as you also need to build and validate the new index at prior
steps. I am just wondering if it is acceptable to you guys to take a
stronger lock only during this switch phase. This won't make the reindex
being concurrently all the time but it would avoid any visibility issues
and have an index switch processing which is more consistent with the
existing implementation as it could rely on the same mechanism as normal
reindex that switches relfilenode.
--
Michael Paquier
http://michael.otacoo.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2012-10-05 02:55:41 Re: pgsql: Disable _FORTIFY_SOURCE with ICC
Previous Message Andres Freund 2012-10-04 21:58:16 Re: Support for REINDEX CONCURRENTLY