Re: Support for REINDEX CONCURRENTLY

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Support for REINDEX CONCURRENTLY
Date: 2013-09-26 20:41:26
Message-ID: CAB7nPqQ2_Lh02yPMb6R1DsK8xavoykLVEZ78=0pAn2BgZWQTTw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 26, 2013 at 8:56 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2013-09-26 20:47:33 +0900, Michael Paquier wrote:
>> On Thu, Sep 26, 2013 at 8:43 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> > On 2013-09-26 20:40:40 +0900, Michael Paquier wrote:
>> >> On Thu, Sep 26, 2013 at 7:34 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> >> > On 2013-09-26 12:13:30 +0900, Michael Paquier wrote:
>> >> >> > 2) I don't think the drop algorithm used now is correct. Your
>> >> >> > index_concurrent_set_dead() sets both indisvalid = false and indislive =
>> >> >> > false at the same time. It does so after doing a WaitForVirtualLocks() -
>> >> >> > but that's not sufficient. Between waiting and setting indisvalid =
>> >> >> > false another transaction could start which then would start using that
>> >> >> > index. Which will not get updated anymore by other concurrent backends
>> >> >> > because of inislive = false.
>> >> >> > You really need to follow index_drop's lead here and first unset
>> >> >> > indisvalid then wait till nobody can use the index for querying anymore
>> >> >> > and only then unset indislive.
>> >> >
>> >> >> Sorry, I do not follow you here. index_concurrent_set_dead calls
>> >> >> index_set_state_flags that sets indislive and *indisready* to false,
>> >> >> not indisvalid. The concurrent index never uses indisvalid = true so
>> >> >> it can never be called by another backend for a read query. The drop
>> >> >> algorithm is made to be consistent with DROP INDEX CONCURRENTLY btw.
>> >> >
>> >> > That makes it even worse... You can do the concurrent drop only in the
>> >> > following steps:
>> >> > 1) set indisvalid = false, no future relcache lookups will have it as valid
>> >
>> >> indisvalid is never set to true for the concurrent index. Swap is done
>> >> with concurrent index having indisvalid = false and former index with
>> >> indisvalid = true. The concurrent index is validated with
>> >> index_validate in a transaction before swap transaction.
>> >
>> > Yes. I've described how it *has* to be done, not how it's done.
>> >
>> > The current method of going straight to isready = false for the original
>> > index will result in wrong results because it's not updated anymore
>> > while it's still being used.
>
>> The index being dropped at the end of process is not the former index,
>> but the concurrent index. The index used after REINDEX CONCURRENTLY is
>> the old index but with the new relfilenode.
>
> That's not relevant unless I miss something.
>
> After phase 4 both indexes are valid (although only the old one is
> flagged as such), but due to the switching of the relfilenodes backends
> could have either of both open, depending on the time they built the
> relcache entry. Right?
> Then you go ahead and mark the old index - which still might be used! -
> as dead in phase 5. Which means other backends might (again, depending
> on the time they have built the relcache entry) not update it
> anymore. In read committed we very well might go ahead and use the index
> with the same plan as before, but with a new snapshot. Which now will
> miss entries.
In this case, doing a call to WaitForOldSnapshots after the swap phase
is enough. It was included in past versions of the patch but removed
in the last 2 versions.

Btw, taking the problem from another viewpoint... This feature has now
3 patches, the 2 first patches doing only code refactoring. Could it
be possible to have a look at those ones first? Straight-forward
things should go first, simplifying the core feature evaluation.

Regards,
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2013-09-26 20:46:27 Re: Support for REINDEX CONCURRENTLY
Previous Message Peter Eisentraut 2013-09-26 20:29:58 Re: gaussian distribution pgbench