Re: Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM
Date: 2017-10-31 04:56:40
Message-ID: 28663.1509425800@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> Tom Lane wrote:
>> So: I put the blame on the fact that summarize_range() thinks that
>> the tuple offset it has for the placeholder tuple is guaranteed to
>> hold good, even across possibly-long intervals where it's holding
>> no lock on the containing buffer.

> Yeah, I think this is a pretty reasonable explanation for the problem.
> I don't understand why it doesn't fail in 9.6.

Yeah, we're still missing an understanding of why we didn't see it
before; the inadequate locking was surely there before. I'm guessing
that somehow the previous behavior of PageIndexDeleteNoCompact managed
to mask the problem (perhaps only by not throwing an error, which doesn't
imply that the index state was good afterwards). But I don't see quite
how it did that.

One thing I think we do know now is that the bug is triggered by two
concurrent executions of summarize_range. So I'd look for edge cases
like whether the placeholder tuple can be deleted and then reinserted
into the same lp index.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-10-31 05:50:06 Re: Current int & float overflow checking is slow.
Previous Message amul sul 2017-10-31 04:47:33 Re: [POC] hash partitioning