Re: random isolation test failures

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Kevin Grittner <kevin(dot)grittner(at)wicourts(dot)gov>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: random isolation test failures
Date: 2011-09-27 03:29:00
Message-ID: 1317093816-sup-4240@alvh.no-ip.org
Lists: pgsql-hackers

Excerpts from Noah Misch's message of Mon Sep 26 21:57:40 -0300 2011:

> These sporadic failures happen whenever the test case takes longer than
> deadlock_timeout (currently 100ms for these tests) to set up the deadlock. I
> outlined some mitigating strategies here:
> http://archives.postgresql.org/message-id/20110727171438.GE18910@tornado.leadboat.com
>
> I'd vote for #1: let's double the deadlock_timeout until the failures stop.
> Other opinions?

I just tweaked isolationtester so that it collects the error messages
and displays them all together at the end of the test. After seeing it
run, I didn't like it -- I think I prefer something more local: in the
one case where we call try_complete_step twice in the loop, we would
report any errors from either call. AFAICS this would make both expected
cases behave identically in test output. The only thing left to figure
out is where to store the error message between calls ... clearly Step
is not the right place for it. I'm on it now, anyway.
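
To illustrate the "more local" idea, here is a minimal sketch in Python
rather than isolationtester's C. Everything in it is an assumption made
for illustration: report_pair, the StepAttempt callback standing in for
try_complete_step, the step names, and the wording of the report line
are hypothetical and are not taken from the attached patch. It shows one
possible reading: hold the error text in a local buffer across the two
calls, then report it in a way that names both steps, so the output is
the same whichever session loses the deadlock.

```python
from typing import Callable, List, Optional

# Hypothetical stand-in for try_complete_step: returns an error message,
# or None if the step finished cleanly. This is not the real C signature.
StepAttempt = Callable[[], Optional[str]]


def report_pair(waiting_name: str, waiting: StepAttempt,
                current_name: str, current: StepAttempt) -> List[str]:
    """Run both completion attempts, then report their errors together."""
    # The error text is kept in a local list between the two calls,
    # rather than being stored on the step itself.
    errors: List[str] = []
    for attempt in (waiting, current):
        message = attempt()
        if message is not None:
            errors.append(message)

    # Each report line names both steps, so the output does not depend
    # on which of the two sessions was chosen as the deadlock victim.
    return [f"error in steps {waiting_name} {current_name}: {m}"
            for m in errors]


def deadlock_victim() -> Optional[str]:
    # Simulates the session that receives the deadlock error.
    return "ERROR:  deadlock detected"


def clean_finish() -> Optional[str]:
    # Simulates the session that completes without error.
    return None


# Either session may lose the deadlock; both cases give the same report.
case_a = report_pair("s1u2", deadlock_victim, "s2u1", clean_finish)
case_b = report_pair("s1u2", clean_finish, "s2u1", deadlock_victim)
```

With this shape the two timing-dependent outcomes compare equal, which
is the property an expected-output file needs. Where the buffer should
live in the real code is still the open question mentioned above.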

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachment: isolation-fix.patch (application/octet-stream, 2.5 KB)
