Re: Bug in ginRedoRecompress that causes opaque data on page to be overrun

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: sivasubr(at)amazon(dot)com
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bug in ginRedoRecompress that causes opaque data on page to be overrun
Date: 2018-09-05 20:54:54
Message-ID: CAPpHfduRz9qeqxVQh+2FAj+uQ040qzijYOM6Lc2o7873UzVL3w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 5, 2018 at 5:05 PM Alexander Korotkov
<a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> On Wed, Sep 5, 2018 at 2:39 PM Alexander Korotkov
> <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> > On Wed, Sep 5, 2018 at 12:26 PM Alexander Korotkov
> > <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> > > On Wed, Sep 5, 2018 at 1:45 AM R, Siva <sivasubr(at)amazon(dot)com> wrote:
> > > > On Tue, Sep 4, 2018 at 09:16 PM, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> > > > > Do you have a test scenario for reproduction of this issue? We need
> > > > > it to ensure that fix is correct.
> > > >
> > > > Unfortunately, I do not have a way of reproducing this issue.
> > > > So far I have tried a workload consisting of inserts (of the
> > > > same attribute value that is indexed), batch deletes of rows
> > > > and vacuum interleaved with engine crash/restarts.
> > >
> > > Issue reproduction and testing is essential for bug fix. Remember
> > > last time you reported GIN bug [1], after issue reproduction it
> > > appears that we have more things to fix. I's quite clear for me that
> > > if segment list contains GIN_SEGMENT_INSERT before GIN_SEGMENT_DELETE,
> > > then it might lead to wrong behavior in ginRedoRecompress(). But it's
> > > not yet clear to understand what code patch could lead to such segment
> > > list... I'll explore code more and probably will come with some idea.
> >
> > Aha, I've managed to reproduce this.
> > 1. Apply ginRedoRecompress-asserts.patch, which add assertions to
> > ginRedoRecompress() detecting past opaque writes.
>
> It was wrong, sorry. It appears that I put strict inequality into
> asserts, while there should be loose inequality. Correct version of
> patch is attached. And scenario I've posted isn't really reproducing
> the bug...

Finally I managed to reproduce the bug. The scenario is following.
Underlying idea is that when insertion of multiple tuples goes to the
beginning of the page and this insertion succeed only thanks to
collapse of some short segments together, then this insertion wouldn't
fit to the page if given alone.

create table test (i integer primary key, a int[]) with (fillfactor = 50);
insert into test (select id, array[id%2]::int[] from
generate_series(1,15376) id);
create index test_idx on test using gin(a) with (fastupdate = off);
update test set a = '{1}' where i % 4 = 0 or i % 16 = 2 or i % 64 in
(6, 46, 36) or i % 256 = 54;
vacuum test;
insert into test (select id + 16376, '{0}' from generate_series(1,5180) id);
update test set a = '{1}' where i <= 15376 and i % 256 in (182, 198);
vacuum test;
alter index test_idx set (fastupdate = on);
delete from test where i <= 134 and a = '{1}';
vacuum test;
insert into test (select id+30000, '{0}' from generate_series(1,110) id);
vacuum test;

With ginRedoRecompress-asserts-v2.patch following assertion is fired.
TRAP: FailedAssertion("!(segptr + newsegsize + (szleft - segsize) <= (
((void) ((_Bool) (! (!(PageValidateSpecialPointer(page))) ||
(ExceptionalCondition("!(PageValidateSpecialPointer(page))",
("FailedAssertion"), "ginxlog.c", 289), 0)))), (char *) ((char *)
(page) + ((PageHeader) (page))->pd_special) ))", File: "ginxlog.c",
Line: 289)

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chapman Flack 2018-09-05 21:09:59 *_to_xml() should copy SPI_processed/SPI_tuptable
Previous Message Andres Freund 2018-09-05 20:33:21 Re: On the need for a snapshot in exec_bind_message()