Re: crash testing suggestions for 12 beta 1

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: crash testing suggestions for 12 beta 1
Date: 2019-05-23 15:55:02
Message-ID: CAH2-Wz=BLfTTKbvvAm0gN=M-Fw-9opyMwavkCSp8V4a3_1F6fg@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 23, 2019 at 8:24 AM Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> Now that beta is out, I wanted to do some crash-recovery testing where I inject PANIC-inducing faults and see if it recovers correctly.

Thank you for doing this. It's important work.

> Making the ctid be tie-breakers in btree index is also tested inherently (plus I think Peter tested that pretty thoroughly himself with similar methods).

As you may know, the B-Tree code has a tendency to soldier on when an
index is corrupt. "Moving right" tends to conceal problems other than
the concurrent page splits it exists to handle. I didn't do very much
fault-injection testing with the B-Tree enhancements, but I did lean
on amcheck heavily during development. Note that a new, extremely
thorough option called "rootdescend" verification was added following
the v12 work:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c1afd175b5b2e5c44f6da34988342e00ecdfb518

It probably wouldn't add noticeable overhead to use this during your
testing, and maybe to combine it with the "heapallindexed" option,
while using the bt_index_parent_check() variant -- that will detect
almost any imaginable index corruption. Admittedly, amcheck didn't
find any bugs in my code after the first couple of versions of the
patch series, so this approach seems unlikely to find any problems
now. Even so, it wouldn't be very difficult to do this extra step.
It seems worthwhile to be thorough here, given that we depend on the
B-Tree code so heavily.
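
A minimal sketch of what that extra step could look like, assuming the
amcheck extension (version 1.2, as shipped with v12) is installed in
the database under test. The helper names below are hypothetical; the
script only builds the SQL text, which can then be run through psql
after each crash-recovery cycle. The catalog filter for valid,
non-temporary btree indexes is an assumption about which indexes are
worth checking, not something prescribed above.

```python
def sql_bool(value: bool) -> str:
    """Render a Python bool as an SQL boolean literal."""
    return "true" if value else "false"


def build_amcheck_query(heapallindexed: bool = True,
                        rootdescend: bool = True) -> str:
    """Build a query that runs bt_index_parent_check() on every btree index.

    Assumptions (not from the original message): only valid, ready,
    non-temporary btree indexes are checked, and the caller has already
    run CREATE EXTENSION amcheck in the target database.
    """
    return (
        "SELECT c.relname,\n"
        "       bt_index_parent_check(c.oid::regclass, "
        f"{sql_bool(heapallindexed)}, {sql_bool(rootdescend)})\n"
        "FROM pg_index i\n"
        "JOIN pg_class c ON c.oid = i.indexrelid\n"
        "JOIN pg_am am ON am.oid = c.relam\n"
        "WHERE am.amname = 'btree'\n"
        "  AND c.relpersistence <> 't'\n"
        "  AND i.indisready AND i.indisvalid\n"
        "ORDER BY c.relname;"
    )


if __name__ == "__main__":
    # Print the statement so it can be piped to psql.
    print(build_amcheck_query())
```

Note that bt_index_parent_check() takes a ShareLock on the index and
its table, so this is best run while the test workload is paused. Any
corruption is reported as an ERROR, which aborts the query at the
first bad index.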

--
Peter Geoghegan
