Re: Duplicate Item Pointers in Gin index

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, "R, Siva" <sivasubr(at)amazon(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Duplicate Item Pointers in Gin index
Date: 2018-06-13 15:20:56
Message-ID: CAD21AoCGHkKtwfHkCZ-5F8f-DWZ4h7stDiKJ=TK01RcsaSp_gg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 13, 2018 at 10:22 PM, Alexander Korotkov
<a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> On Wed, Jun 13, 2018 at 11:40 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>
>> On Wed, Jun 13, 2018 at 3:32 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>> > On Tue, Jun 12, 2018 at 11:01 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> >> FWIW, I've looked at this again. I think that the situation Siva
>> >> reported in the first mail can happen before commit 3b2787e.
>> >> That is, gin indexes had a data corruption bug. I've reproduced
>> >> the situation with PostgreSQL 10.1 and observed that a gin index can
>> >> become corrupted.
>> >
>> > So, you've recreated the problem with Postgres from before 3b2787e,
>> > but not after 3b2787e? Are you suggesting that 3b2787e might have
>> > fixed it, or that it only hid the problem, or something else?
>>
>> I meant 3b2787e fixed it. I checked that at least the situation
>> doesn't happen after 3b2787e.
>
> I also think that 3b2787e should fix such problems. After 3b2787e,
> vacuum is forced to clean up all pending list entries that were
> inserted before vacuum started. So, vacuum should have everything to be
> vacuumed merged into posting lists/trees.
>
>> > How did you recreate the problem? Do you have a test case you can share?
>>
>> I recreated it by executing each step one at a time using gdb. So I
>> can share the test case, but it might not help.
>>
>> create extension pageinspect;
>> create table g (c int[]);
>> insert into g select ARRAY[1] from generate_series(1,1000);
>> create index g_idx on g using gin (c);
>> alter table g set (autovacuum_enabled = off);
>> -- 408 items fit in exactly one page of pending list
>> insert into g select ARRAY[1] from generate_series(1, 408);
>> -- insert into 2nd page of pending list
>> insert into g select ARRAY[1] from generate_series(1, 100);
>> select n_pending_pages, n_pending_tuples from
>> gin_metapage_info(get_raw_page('g_idx', 0));
>> insert into g select ARRAY[999]; -- insert into 2nd pending list page
>> delete from g where c = ARRAY[999];
>> -- At this point, gin entry of 'ARRAY[999]' exists on 2nd page of
>> -- pending list and deleted.
>
> Is this test case complete? It looks like there should be a
> continuation with concurrent vacuum and insertions managed by gdb...
>

This is not a complete test case. It covers only step 1; we still need
the concurrent vacuum and insertions, as you said.
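
To make the state after step 1 easier to check, the following is a
minimal sketch that inspects the pending list with pageinspect. It
assumes the g table and g_idx index from the quoted steps, and that the
pending list is non-empty (otherwise pending_tail is not a valid block
number and the cast below would fail). It is an illustration only, not
part of the original reproduction.

```sql
-- Metapage view: head and tail block numbers of the pending list,
-- plus the page and tuple counts already shown in step 1.
select pending_head, pending_tail, n_pending_pages, n_pending_tuples
from gin_metapage_info(get_raw_page('g_idx', 0));

-- Opaque data of the tail page of the pending list, which is where
-- the ARRAY[999] entry is expected to sit after step 1.
-- Assumption: the block number fits in an int.
select o.rightlink, o.maxoff, o.flags
from gin_metapage_info(get_raw_page('g_idx', 0)) m,
     lateral gin_page_opaque_info(
         get_raw_page('g_idx', m.pending_tail::int)) o;
```

Running the first query before and after a vacuum should show whether
the pending list was flushed into the posting lists/trees.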

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
