Re: Index corruption with CREATE INDEX CONCURRENTLY

From: Keith Fiske <keith(at)omniti(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Index corruption with CREATE INDEX CONCURRENTLY
Date: 2017-02-17 15:39:05
Message-ID: CAG1_KcC8EYd0M00WXtTVmAF495NKbu+1e0TZn=RBF2KXAKcy9Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 6, 2017 at 10:17 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:

> On Mon, Feb 6, 2017 at 10:28 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> >> Hmm. Consider that the first time relcahe invalidation occurs while
> >> computing id_attrs, so now the retry logic will compute the correct
> >> set of attrs (considering two indexes, if we take the example given by
> >> Alvaro above.). However, the attrs computed for hot_* and key_* will
> >> be using only one index, so this will lead to a situation where two of
> >> the attrs (hot_attrs and key_attrs) are computed with one index and
> >> id_attrs is computed with two indexes. I think this can lead to
> >> Assertion in HeapSatisfiesHOTandKeyUpdate().
> >
> > That seems like something we'd be best off to fix by changing the
> > assertion.
> >
>
> The above-mentioned problem doesn't exist in your version of the patch
> (which is now committed) because the index attrs are cached after
> invalidation and I have mentioned the same in my yesterday's followup
> mail. The guarantee of safety is that we don't handle invalidation
> between two consecutive calls to RelationGetIndexAttrBitmap() in
> heap_update, but if we do handle due to any reason which is not
> apparent to me, then I think there is some chance of hitting the
> assertion.
>
> > I doubt it's really going to be practical to guarantee that
> > those bitmapsets are always 100% consistent throughout a transaction.
> >
>
> Agreed. As the code stands, I think it is only expected to be
> guaranteed across three consecutive calls to
> RelationGetIndexAttrBitmap() in heap_update. Now, if the assertion in
> HeapSatisfiesHOTandKeyUpdate() is useless and we can remove it, then
> probably we don't need a guarantee to be consistent in three
> consecutive calls as well.
>
> >> Okay, please find attached patch which is an extension of Tom's and
> >> Andres's patch and I think it fixes the above problem noted by me.
> >
> > I don't like this patch at all. It only fixes the above issue if the
> > relcache inval arrives while RelationGetIndexAttrBitmap is executing.
> > If the inval arrives between two such calls, you still have the problem
> > of the second one delivering a bitmapset that might be inconsistent
> > with the first one.
> >
>
> I think it won't happen between two consecutive calls to
> RelationGetIndexAttrBitmap in heap_update. It can happen during
> multi-row update, but that should be fine. I think this version of
> patch only defers the creation of fresh bitmapsets where ever
> possible.
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

Was just curious if anyone was able to come up with any sort of method to
test whether an index was corrupted by this bug, other than just waiting
for bad query results? We've used concurrent index rebuilding quite
extensively over the years to remove bloat from busy systems, but
reindexing the entire database "just in case" is unrealistic in many of our
cases.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-02-17 15:45:35 Re: Instability in select_parallel regression test
Previous Message Alvaro Herrera 2017-02-17 15:19:56 Re: pg_recvlogical.c doesn't build with --disable-integer-datetimes