Re: [PATCH] Microvacuum for gist.

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Microvacuum for gist.
Date: 2015-09-16 04:30:11
Message-ID: CAMkU=1zjJdVXbQ4xKLTBDoD3BosA3SaqyBMn6nYbfaeY8U-YgA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 8, 2015 at 2:35 PM, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru> wrote:

> Fixed patch is attached.

The commit of this patch seems to have created a bug in which updated
tuples can disappear from the index, while remaining in the table.

It looks like the bug depends on going through a crash-recovery cycle, but
I am not sure of that yet.

I've looked through the commit diff and don't see anything obviously
wrong. I notice that index tuples are marked dead while holding only a
buffer content share lock, and that the page is defragmented while holding
only a buffer exclusive lock (as opposed to a super-exclusive buffer
cleanup lock). But as far as I can tell, both of those should be safe on
an index. Also, if that were the bug, it should show up without
crash-recovery as well.

The test is pretty simple. I create a 10,000-row table with an id column
that is unique by construction and has a btree_gist index on it, plus a
counter column, and fire single-row updates of the counter for random ids
at high concurrency (8 processes running flat out). I force the server to
crash frequently with simulated torn-page writes in which md.c writes a
partial page and then PANICs. Eventually (after 1 to 3 hours) the updates
start reporting that they updated 0 rows. At that point, a forced table
scan finds the row, but an index scan does not.
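
For anyone who wants to try something similar, here is a rough sketch of
the SQL side of such a test, written as a small Python helper that only
builds the statement text. The table and column names (gist_test, id,
counter) and the helper names are made up for illustration and are not the
ones from my harness; the crash injection in md.c and the 8 concurrent
clients are not shown.

```python
import random

TABLE = "gist_test"   # hypothetical table name, not from the original test
N_ROWS = 10_000       # table size described above


def setup_statements(table=TABLE, n_rows=N_ROWS):
    """SQL to create the table, fill it with unique ids, and index it with GiST."""
    return [
        "CREATE EXTENSION IF NOT EXISTS btree_gist",
        f"CREATE TABLE {table} (id int, counter int)",
        # ids come from generate_series, so they are unique by construction
        f"INSERT INTO {table} SELECT g, 0 FROM generate_series(1, {n_rows}) AS g",
        f"CREATE INDEX ON {table} USING gist (id)",
    ]


def random_update(rng, table=TABLE, n_rows=N_ROWS):
    """One single-row counter update for a randomly chosen id."""
    row_id = rng.randint(1, n_rows)
    sql = f"UPDATE {table} SET counter = counter + 1 WHERE id = {row_id}"
    return row_id, sql


def consistency_check(row_id, table=TABLE):
    """Statement lists that look up one row via the heap and via the index."""
    query = f"SELECT count(*) FROM {table} WHERE id = {row_id}"
    via_table = [
        "SET enable_seqscan = on",
        "SET enable_indexscan = off",
        "SET enable_indexonlyscan = off",
        "SET enable_bitmapscan = off",
        query,
    ]
    via_index = [
        "SET enable_seqscan = off",
        "SET enable_indexscan = on",
        "SET enable_indexonlyscan = on",
        "SET enable_bitmapscan = on",
        query,
    ]
    return via_table, via_index
```

Each client would loop on random_update() and, whenever an UPDATE reports
0 rows, run both statement lists from consistency_check() for that id. A
count of 1 from the table scan together with 0 from the index scan is the
symptom described above.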

Any hints on how to proceed with debugging this? If an overnight run
without crash-recovery cycles fails to reproduce the problem, then I think
my next step will be to run it over hot-standby and see if WAL replay in
the absence of crashes might be broken as well.

Cheers,

Jeff
