Re: Deadlock in multiple CIC.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deadlock in multiple CIC.
Date: 2018-04-18 15:27:49
Message-ID: 8071.1524065269@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
>> It's still not entirely clear what's happening on okapi, ...

okapi has now passed two consecutive runs with elog(LOG) messages in place
between DefineIndex's snapmgr calls. Considering that it had failed 37 of
44 test runs since 47a3a13 went in, I think two successive passes is
sufficient evidence to conclude that we have a Heisenbug in which the
presence of debug tooling affects the result. And that in turn suggests
strongly that it's a compiler bug. Broken interprocedural optimization,
perhaps? Although it'd have to be cross-file optimization, which is
more than I thought icc would do.

Anyway, at this point I'm going to give up on the debug logging, revert
9.4 to its prior state, and then see if the transaction-restart patch
makes the problem go away.

>> (A couple of the other isolation tests do fail reliably under this
>> scenario; is it worth hardening them?)

> Yes, I think it's worth making them pass somehow -- see commits
> f18795e7b74c, a0eae1a2eeb6.

Will look into that too. I'm not sure that adding extra expected
outputs is sane, though --- might be best to just force the intended
isolation level within those tests.

regards, tom lane
