Re: Index corruption with CREATE INDEX CONCURRENTLY

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Index corruption with CREATE INDEX CONCURRENTLY
Date: 2017-02-06 02:38:01
Message-ID: CAA4eK1JoPKOgAO+aq6XEmgGi+3vj64_d_RkeNj9JfOYEC=J0mA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 6, 2017 at 6:27 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> Hi,
>
> On 2017-02-05 16:37:33 -0800, Andres Freund wrote:
>> > RelationGetIndexList(Relation relation)
>> > @@ -4746,8 +4747,10 @@ RelationGetIndexPredicate(Relation relat
>> > * we can include system attributes (e.g., OID) in the bitmap representation.
>> > *
>> > * Caller had better hold at least RowExclusiveLock on the target relation
>> > - * to ensure that it has a stable set of indexes. This also makes it safe
>> > - * (deadlock-free) for us to take locks on the relation's indexes.
>> > + * to ensure it is safe (deadlock-free) for us to take locks on the relation's
>> > + * indexes. Note that since the introduction of CREATE INDEX CONCURRENTLY,
>> > + * that lock level doesn't guarantee a stable set of indexes, so we have to
>> > + * be prepared to retry here in case of a change in the set of indexes.
>>
>> I've not yet read the full thread, but I'm a bit confused so far. We
>> obviously can get changing information about indexes here, but isn't
>> that something we have to deal with anyway? If we guarantee that we
>> don't loose knowledge that there's a pending invalidation, why do we
>> have to retry? Pretty much by definition the knowledge in a relcache
>> entry can be outdated as soon as returned unless locking prevents that
>> from being possible - which is not the case here.
>>
>> ISTM it'd be better not to have retry logic here, but to follow the more
>> general pattern of making sure that we know whether the information
>> needs to be recomputed at the next access. We could either do that by
>> having an additional bit of information about the validity of the
>> bitmaps (so we have invalid, building, valid - and only set valid at the
>> end of computing the bitmaps when still building and not invalid again),
>> or we simply move the bitmap computation into the normal relcache build.
>
> To show what I mean here's an *unpolished* and *barely tested* patch
> implementing the first of my suggestions.
>

+ * Signal that the attribute bitmaps are being built. If there's any
+ * relcache invalidations while building them, rd_bitmapsvalid will be
+ * reset to 0. In that case return the bitmaps, but don't mark them as
+ * valid.
+ */

I don't see in your patch that you are setting rd_bitmapsvalid to 0.
Also, I think this suffers from the same problem as the patch proposed
by Alvaro which is that different attrs (hot_attrs, key_attrs and
id_attrs) will get computed based on different index lists which can
cause a problem as mentioned in my mail up thread. However, I think
your general approach and idea that we have to set the things up for
next time is on spot. I am not sure but I think your solution can be
extended to make it somewhat similar to what we do with rd_indexvalid
(basically, we should set rd_bitmapsvalid to 2 (bitmap is temporarily
forced) at rel cache invalidation and then clean it up transaction
end). I can try something on those lines.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavan Deolasee 2017-02-06 02:42:34 Re: Index corruption with CREATE INDEX CONCURRENTLY
Previous Message Pavan Deolasee 2017-02-06 02:31:57 Re: Index corruption with CREATE INDEX CONCURRENTLY