Re: Deadlock in multiple CIC.

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deadlock in multiple CIC.
Date: 2018-04-17 18:13:30
Message-ID: 20180417181330.g53voqyys6m2vgwc@alvherre.pgsql
Lists: pgsql-hackers

Tom Lane wrote:

> It's still not entirely clear what's happening on okapi, but in the
> meantime I've thought of an easily-reproducible way to cause similar
> failures in any branch. That is to run CREATE INDEX CONCURRENTLY
> with default_transaction_isolation = serializable. Then, snapmgr.c
> will set up a transaction snapshot (actually identical to the
> "reference snapshot" used by DefineIndex), and that will not get
> released, so the process's xmin doesn't get cleared, and we have
> a deadlock hazard.

Hah, ouch.

> I experimented with running the isolation tests under "alter system set
> default_transaction_isolation to serializable". Oddly, multiple-cic
> tends to not fail that way for me, though if I reduce the
> isolation_schedule file to contain just that one test, it fails nine
> times out of ten. Leftover activity from the previous tests must be
> messing up the timing somehow. Anyway, the problem is definitely real.
> (A couple of the other isolation tests do fail reliably under this
> scenario; is it worth hardening them?)

Yes, I think it's worth making them pass somehow -- see commits
f18795e7b74c and a0eae1a2eeb6.

> I thought for a bit about trying to force C.I.C.'s transactions to
> be run with a lower transaction isolation level, but that seems messy
> and I'm not very sure it wouldn't have bad side-effects. A much simpler
> fix is to just start YA transaction before waiting, as in the attached
> proposed patch. (With the transaction restart, I feel sufficiently
> confident that there should be no open snapshots that it seems okay
> to put in the Assert I was previously afraid to add.)

Seems like an acceptable fix to me.
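
For anyone following along, the hazard quoted above can be pictured with a
toy wait-for model. This is only a sketch under simplifying assumptions:
Session, wait_for_graph and has_cycle are made-up names, not anything in the
PostgreSQL tree, and the model pretends that a waiting C.I.C. waits on every
other session that still holds a snapshot, whereas the real wait only targets
snapshots older than the reference snapshot.

```python
# Toy model only: hypothetical names, not PostgreSQL internals.
from dataclasses import dataclass


@dataclass
class Session:
    name: str
    holds_snapshot: bool  # True if the session still advertises an xmin while waiting


def wait_for_graph(sessions):
    """Map each waiting session to the other sessions that still hold a snapshot."""
    return {
        s.name: {o.name for o in sessions if o is not s and o.holds_snapshot}
        for s in sessions
    }


def has_cycle(graph):
    """Depth-first search for a cycle in the wait-for graph."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node already on the current path
        visiting.add(node)
        found = any(visit(n) for n in graph[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(n) for n in graph)


# Two C.I.C. sessions whose transaction snapshot was never released.
leaked = [Session("cic1", True), Session("cic2", True)]
# The same two sessions after each restarts its transaction before waiting.
restarted = [Session("cic1", False), Session("cic2", False)]

print(has_cycle(wait_for_graph(leaked)))
print(has_cycle(wait_for_graph(restarted)))
```

In this model the first pair waits on each other and the second pair does
not, which is the intuition behind restarting the transaction before the
wait so that no snapshot is left open.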

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
