Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Pawel Kudzia <kudzia(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows
Date: 2021-07-22 07:25:43
Message-ID: 9fdbb584-1e10-6a55-ecc2-9ba8b5dca1cf@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 20/07/2021 18:58, Pawel Kudzia wrote:
> On Tue, Jul 20, 2021 at 5:34 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>> Pawel: Can you test once again with the attached amcheck version?
>
> patch v4-0001-Amcheck-for-GIN-13stable.patch applied to
> postgresql-13_13.3.orig.tar.bz2
> attached - log and result of the check.
>
> executing "select
> gin_index_parent_check('entity_attribute_name_ids_gin');" took
> considerably longer this time and it ended with :
>
> [524767.920498] postgres[29716]: segfault at fffffffffffffffe ip
> 00007f214c20a2d8 sp 00007ffe2a5fd040 error 5 in
> amcheck.so[7f214c209000+6000]
> [524767.920521] Code: 00 00 41 0f b7 4e 04 41 8b 56 0c 41 8b 76 10 49
> 89 c1 48 8d 04 40 c1 e1 10 45 0f b7 46 08 48 8d 7c 43 fa 41 0f b7 46
> 06 09 c1 <0f> b7 47 04 50 0f b7 07 0f b7 7f 02 c1 e0 10 09 f8 50 0f b7
> 43 04

Hmph, another set of bugs in the amcheck patch. It didn't handle empty
entry tree items, nor empty posting tree pages. And there was another
bug, which caused it to scan entry tree pages multiple times; that's why
the same WARNING was printed multiple times.

Fixed those bugs, new patch version attached. Pawel, can you test this
again, please? At this point, I'm pretty sure this isn't going to reveal
any more information about the original problem, but at least we're
ironing out bugs from the 'amcheck' patch..

I'm grasping at straws here, but here's one more thing we could try: the
query returned these incorrect tuples:

ctid | entity_id
--------------+-----------
(4002784,1) | 38048120
(4002869,14) | 95333744
(2 rows)

We know those entries are in the GIN index with key '1373', when they
shouldn't be. But I wonder if the correct keys for those tuples are
present? Pawel, can you try this, please:

-- persuade planner to use a bitmap index scan
set enable_seqscan=off;
set enable_tidscan=off;
-- avoid lossy bitmaps
set work_mem='1GB';

-- Explain analyze first to check it's using a bitmap index scan and
-- no lossy pages.
explain analyze
SELECT ctid, entity_id FROM entity WHERE attribute_name_ids &&
'{38048120}' and ctid='(4002784,1)';
SELECT ctid, entity_id FROM entity WHERE attribute_name_ids &&
'{38048120}' and ctid='(4002784,1)';

explain analyze
SELECT ctid, entity_id FROM entity WHERE attribute_name_ids &&
'{95333744}' and ctid='(4002869,14)';
SELECT ctid, entity_id FROM entity WHERE attribute_name_ids &&
'{95333744}' and ctid='(4002869,14)';

- Heikki

Attachment Content-Type Size
v5-0001-Amcheck-for-GIN-13stable.patch text/x-patch 39.4 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Artur Zakirov 2021-07-22 09:04:07 Re: BUG #15293: Stored Procedure Triggered by Logical Replication is Unable to use Notification Events
Previous Message Peter Eisentraut 2021-07-22 06:20:28 Re: printf %s with NULL pointer (was Re: BUG #17098: Assert failed on composing an error message when adding a type to an extension being dropped)