Re: Index corruption with CREATE INDEX CONCURRENTLY

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Index corruption with CREATE INDEX CONCURRENTLY
Date: 2017-02-06 03:05:58
Message-ID: CABOikdP88gE9Ekrw0f3c5tMqnMDgp5qp583mSyFxpxk1=6at9A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 6, 2017 at 8:15 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Mon, Feb 6, 2017 at 8:01 AM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
> wrote:
> >
> >
> > On Mon, Feb 6, 2017 at 1:44 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >>
> >>
> >>
> >> > 2. In the second patch, we tried to recompute attribute lists if a
> >> > relcache
> >> > flush happens in between and index list is invalidated. We've seen
> >> > problems
> >> > with that, especially it getting into an infinite loop with
> >> > CACHE_CLOBBER_ALWAYS. Not clear why it happens, but apparently
> relcache
> >> > flushes keep happening.
> >>
> >> Well, yeah: the point of CLOBBER_CACHE_ALWAYS is to make a relcache
> flush
> >> happen whenever it possibly could.
> >
> >
> > Ah, ok. That explains why the retry logic as originally proposed was
> causing
> > infinite loop.
> >
> >>
> >> The way to deal with that without
> >> looping is to test whether the index set *actually* changed, not whether
> >> it just might have changed.
> >
> >
> > I like this approach. I'll run the patch on a build with
> > CACHE_CLOBBER_ALWAYS, but I'm pretty sure it will be ok. I also like the
> > fact that retry logic is now limited to only when index set changes,
> which
> > fits in the situation we're dealing with.
> >
>
> Don't you think it also has the same problem as mentioned by me in
> Alvaro's patch and you also agreed on it?

No, I don't think so. There can't be any more relcache flushes occurring
between the first RelationGetIndexAttrBitmap() and the subsequent
RelationGetIndexAttrBitmap() calls. The problem with the earlier proposed
approach was that we were not caching, yet returning stale information.
That implied the next call will again recompute a possibly different value.
The retry logic is trying to close a small race yet important race
condition where index set invalidation happens, but we cache stale
information.

> Do you see any need to fix
> it locally in
> RelationGetIndexAttrBitmap(), why can't we clear at transaction end?
>
>
We're trying to fix it in RelationGetIndexAttrBitmap() because that's where
the race exists. I'm not saying there aren't other and better ways of
handling it, but we don't know (I've studied Andres's proposal yet).

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavan Deolasee 2017-02-06 03:15:48 Re: Index corruption with CREATE INDEX CONCURRENTLY
Previous Message Josh Soref 2017-02-06 02:50:22 Re: Possible spelling fixes