Re: [BUG] Error in BRIN summarization

From: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [BUG] Error in BRIN summarization
Date: 2020-07-30 13:40:46
Message-ID: 2588667e-d07d-7e10-74e2-7e1e46194491@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 27.07.2020 20:25, Alvaro Herrera wrote:
> On 2020-Jul-27, Anastasia Lubennikova wrote:
>
>> Here is the updated version of the fix.
>> The problem can be reproduced on all supported versions, so I suggest to
>> backpatch it.
>> Code slightly changed in v12, so here are two patches: one for versions 9.5
>> to 11 and another for versions from 12 to master.
>>
>>
>>
>> (I was also considering whether it needs to be a loop to reobtain root
>> tuples, in case a concurrent transaction can create a new item while
>> we're checking that item; but I don't think that can really happen for
>> one individual tuple.)
I don't think we need a recheck for a single tuple, because we only care
about finding its root, which simply must exist somewhere on this page,
as concurrent pruning is not allowed. We also may catch root_offsets[]
for subsequent tuples, but it's okay if we don't. These tuples will do
the same recheck on their turn.

While testing this fix, Alexander Lakhin spotted another problem. I
simplified  the test case to this:

1) prepare a table with brin index

create table tbl (iint, bchar(1000)) with (fillfactor=10);
insert into tbl select i, md5(i::text) from generate_series(0,1000) as i;
create index idx on tbl using brin(i, b) with (pages_per_range=1);

2) run brin_desummarize_range() in a loop:

echo "-- desummarize all ranges
SELECT FROM generate_series(0, pg_relation_size('tbl')/8192 - 1) as i, lateral (SELECT brin_desummarize_range('idx', i)) as d;
-- summarize them back
VACUUM tbl" > brin_desum_test.sql

pgbench postgres -f  brin_desum_test.sql -n -T 120

3) run a search that invokes bringetbitmap:

set enable_seqscan to off;
 explain analyze select * from tbl where i>10 and i<100; \watch 1

After a few runs, it will fail with "ERROR: corrupted BRIN index:
inconsistent range map"

The problem is caused by a race in page locking in
brinGetTupleForHeapBlock [1]:

(1) bitmapsan locks revmap->rm_currBuf and finds the address of the
tuple on a regular page "page", then unlocks revmap->rm_currBuf
(2) in another transaction desummarize locks both revmap->rm_currBuf and
"page", cleans up the tuple and unlocks both buffers
(1) bitmapscan locks buffer, containing "page", attempts to access the
tuple and fails to find it

At first, I tried to fix it by holding the lock on revmap->rm_currBuf
until we locked the regular page, but it causes a deadlock with
brinsummarize(), It can be easily reproduced with the same test as above.
Is there any rule about the order of locking revmap and regular pages in
brin? I haven't found anything in README.

As an alternative, we can leave locks as is and add a recheck, before
throwing an error.

What do you think?

[1]
https://github.com/postgres/postgres/blob/master/src/backend/access/brin/brin_revmap.c#L269

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2020-07-30 14:22:33 Re: Missing CFI in hlCover()?
Previous Message Amit Kapila 2020-07-30 13:12:44 Re: INSERT INTO SELECT, Why Parallelism is not selected?