From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru> |
Cc: | Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PATCH] Microvacuum for gist. |
Date: | 2015-09-16 04:30:11 |
Message-ID: | CAMkU=1zjJdVXbQ4xKLTBDoD3BosA3SaqyBMn6nYbfaeY8U-YgA@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Sep 8, 2015 at 2:35 PM, Anastasia Lubennikova <
a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
>
> Fixed patch is attached.
>
>
The commit of this patch seems to have created a bug in which updated
tuples can disappear from the index, while remaining in the table.
It looks like the bug depends on going through a crash-recovery cycle, but
I am not sure of that yet.
I've looked through the commit diff and don't see anything obviously
wrong. I notice index tuples are marked dead with only a buffer content
share lock, and the page is defragmented with only a buffer exclusive lock
(as opposed to a super-exclusive buffer cleanup lock). But as far as I
can tell, both of those should be safe on an index. Also, if that were the
bug, it should also happen without crash-recovery.
The test is pretty simple. I create a 10,000 row table with a
unique-by-construction id column with a btree_gist index on it and a
counter column, and fire single-row updates of the counter for random ids
in high concurrency (8 processes running flat out). I force the server to
crash frequently with simulated torn-page writes in which md.c writes a
partial page and then PANICs. Eventually (after 1 to 3 hours) the updates
start reporting that they updated 0 rows. At that point, a forced table scan
will find the row, but an index lookup doesn't.
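
For readers who want to picture the shape of this test, here is a minimal sketch. It is not the actual harness. The table name foo, the column names id and count, the `%s` parameter placeholder (psycopg-style) and all helper names are assumptions for illustration. The sketch only builds the SQL text and the final comparison. Running the statements through a driver, the 8 concurrent workers, and the torn-page crash injection in md.c are left out.

```python
import random

TABLE = "foo"      # hypothetical table name, not taken from the original test
N_ROWS = 10_000    # row count from the description above


def setup_statements(table=TABLE, n_rows=N_ROWS):
    """SQL to build a table with a unique-by-construction id, a counter
    column, and a GiST index on id (btree_gist supplies the opclass)."""
    return [
        "CREATE EXTENSION IF NOT EXISTS btree_gist",
        f"CREATE TABLE {table} (id int, count int)",
        f"INSERT INTO {table} SELECT g, 0 FROM generate_series(1, {n_rows}) AS g",
        f"CREATE INDEX ON {table} USING gist (id)",
    ]


def random_update(rng, table=TABLE, n_rows=N_ROWS):
    """One single-row counter update for a random id, as (sql, params).
    A worker would check the reported row count; 0 is the failure signal."""
    sql = f"UPDATE {table} SET count = count + 1 WHERE id = %s"
    return sql, (rng.randint(1, n_rows),)


def lookup_statements(row_id, use_index, table=TABLE):
    """Statements that look up one id while steering the planner toward
    either the index or a sequential (table) scan."""
    if use_index:
        settings = ["SET enable_seqscan = off"]
    else:
        settings = [
            "SET enable_indexscan = off",
            "SET enable_indexonlyscan = off",
            "SET enable_bitmapscan = off",
        ]
    return settings + [f"SELECT id, count FROM {table} WHERE id = {int(row_id)}"]


def is_lost_from_index(rows_via_seqscan, rows_via_index):
    """The symptom described above: the table scan finds the row,
    the index lookup does not."""
    return len(rows_via_seqscan) > 0 and len(rows_via_index) == 0
```

Each worker would loop over random_update() and, on the first update that reports 0 rows, run both variants of lookup_statements() for that id and compare the results with is_lost_from_index().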
Any hints on how to proceed with debugging this? If I can't get it to
reproduce the problem in the absence of crash-recovery cycles with an
overnight run, then I think my next step will be to run it over hot-standby
and see if WAL replay in the absence of crashes might be broken as well.
Cheers,
Jeff