Re: Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM
Date: 2017-10-31 22:44:36
Message-ID: 083d996a-4a8a-0e13-800a-851dd09ad8cc@2ndquadrant.com
Lists: pgsql-hackers

Hi,

On 10/31/2017 08:46 PM, Tom Lane wrote:
> I wrote:
>> maybe
>> we just have some run-of-the-mill bugs to find, like the off-the-end
>> bug I spotted in brin_doupdate. There's apparently at least one
>> more, but given the error message it must be something like not
>> checking for a page to have turned into a revmap page. Shouldn't
>> be too hard to find...
>
> Actually, I think it might be as simple as the attached.
> brin_getinsertbuffer checks for the old page having turned into revmap,
> but the "samepage" path in brin_doupdate does not :-(
>
> With this applied, Alvaro's version of the test case has survived
> without error for quite a bit longer than its former MTBF. There
> might still be some issues though in other code paths.
>

That does fix the crashes for me - I've been unable to reproduce any
crashes even after one hour (it took a couple of minutes to crash before).

Unfortunately, I think we still have a problem ... I've been wondering
if we end up producing correct indexes, so I've done a simple test.

1) create the table as before

2) let the insert + vacuum run for some time, to see if there are
crashes (result: no crashes after one hour, inserting ~92M rows)

3) do a bunch of random updates on the data (while still doing the
concurrent vacuum in another session)

4) run a bunch of simple queries to compare the results, essentially

-- BRIN index
SET enable_bitmapscan = on;
SELECT COUNT(*) FROM brin_test WHERE a = $1;

-- seq scan
SET enable_bitmapscan = off;
SELECT COUNT(*) FROM brin_test WHERE a = $1;

and unfortunately what I get is not particularly pleasant:

test=# set enable_bitmapscan = on;
SET
test=# select count(*) from brin_test where a = 0;
 count
-------
  9062
(1 row)

test=# set enable_bitmapscan = off;
SET
test=# select count(*) from brin_test where a = 0;
 count
-------
  9175
(1 row)
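
As a rough sketch (not part of the attached script), the comparison in
step 4 could be automated along these lines. It assumes a Python DB-API
connection whose driver uses %s placeholders (psycopg2 does); the function
name compare_counts and its parameters are made up for illustration. Like
the manual test, it only toggles enable_bitmapscan and does not check
which plan was actually chosen.

```python
def compare_counts(conn, values, table="brin_test", column="a"):
    """Count matching rows with enable_bitmapscan on and off.

    Returns a list of (value, bitmapscan_count, seqscan_count) tuples
    for every value where the two counts disagree.
    `conn` is assumed to be a DB-API connection using %s placeholders.
    """
    mismatches = []
    cur = conn.cursor()
    # table/column are trusted identifiers here, not user input
    query = f"SELECT COUNT(*) FROM {table} WHERE {column} = %s"
    for value in values:
        counts = []
        for setting in ("on", "off"):
            # "on" lets the planner use the BRIN index via a bitmap scan,
            # "off" pushes it towards a sequential scan
            cur.execute(f"SET enable_bitmapscan = {setting}")
            cur.execute(query, (value,))
            counts.append(cur.fetchone()[0])
        if counts[0] != counts[1]:
            mismatches.append((value, counts[0], counts[1]))
    cur.close()
    return mismatches
```

With psycopg2 this might be called as
compare_counts(psycopg2.connect("dbname=test"), range(100)); any
non-empty result means the index scan and the seq scan disagree for
some value.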

Attached is a SQL script with the commands I used. You'll need to copy the
commands into multiple psql sessions, though, to simulate concurrent
activity.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
brin-test.sql application/sql 1.2 KB
