From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Deadlock in multiple CIC. |
Date: | 2018-04-17 18:13:30 |
Message-ID: | 20180417181330.g53voqyys6m2vgwc@alvherre.pgsql |
Lists: | pgsql-hackers |
Tom Lane wrote:
> It's still not entirely clear what's happening on okapi, but in the
> meantime I've thought of an easily-reproducible way to cause similar
> failures in any branch. That is to run CREATE INDEX CONCURRENTLY
> with default_transaction_isolation = serializable. Then, snapmgr.c
> will set up a transaction snapshot (actually identical to the
> "reference snapshot" used by DefineIndex), and that will not get
> released, so the process's xmin doesn't get cleared, and we have
> a deadlock hazard.

Hah, ouch.

> I experimented with running the isolation tests under "alter system set
> default_transaction_isolation to serializable". Oddly, multiple-cic
> tends to not fail that way for me, though if I reduce the
> isolation_schedule file to contain just that one test, it fails nine
> times out of ten. Leftover activity from the previous tests must be
> messing up the timing somehow. Anyway, the problem is definitely real.
> (A couple of the other isolation tests do fail reliably under this
> scenario; is it worth hardening them?)

Yes, I think it's worth making them pass somehow -- see commits
f18795e7b74c, a0eae1a2eeb6.

> I thought for a bit about trying to force C.I.C.'s transactions to
> be run with a lower transaction isolation level, but that seems messy
> and I'm not very sure it wouldn't have bad side-effects. A much simpler
> fix is to just start YA transaction before waiting, as in the attached
> proposed patch. (With the transaction restart, I feel sufficiently
> confident that there should be no open snapshots that it seems okay
> to put in the Assert I was previously afraid to add.)

Seems like an acceptable fix to me.
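
For readers following the thread, the hazard and the effect of the restart can be sketched with a toy model. This is not PostgreSQL code: `Backend`, `wait_for_graph`, `has_cycle`, `simulate` and the xmin value 100 are all made up for illustration, and the wait rule (wait for every other backend whose advertised xmin is at or before the waiter's limit) is a simplification of what the wait for older snapshots does. Two sessions run C.I.C. at once; each one still advertises an xmin when it starts waiting, so each waits on the other. Restarting the transaction first is modeled as clearing the xmin.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Backend:
    """Hypothetical stand-in for one session; not a PostgreSQL structure."""
    name: str
    xmin: Optional[int] = None        # None models "no snapshot held"
    limit_xmin: Optional[int] = None  # set once the session starts waiting


def wait_for_graph(backends: List[Backend]) -> Dict[str, List[str]]:
    """Map each backend to the backends it is (in this model) waiting on."""
    graph: Dict[str, List[str]] = {}
    for waiter in backends:
        if waiter.limit_xmin is None:
            graph[waiter.name] = []
            continue
        graph[waiter.name] = [
            other.name
            for other in backends
            if other is not waiter
            and other.xmin is not None
            and other.xmin <= waiter.limit_xmin
        ]
    return graph


def has_cycle(graph: Dict[str, List[str]]) -> bool:
    """Depth-first search for a cycle in the wait-for graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node: str) -> bool:
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


def simulate(restart_before_wait: bool) -> bool:
    """Return True if the two modeled C.I.C. sessions end up deadlocked."""
    sessions = [Backend("cic_1"), Backend("cic_2")]
    reference_xmin = 100  # arbitrary value for the sketch

    # Both sessions take a snapshot; under serializable it is not released.
    for s in sessions:
        s.xmin = reference_xmin

    # Each session reaches the wait step.
    for s in sessions:
        if restart_before_wait:
            s.xmin = None  # models starting a fresh transaction first
        s.limit_xmin = reference_xmin

    return has_cycle(wait_for_graph(sessions))


if __name__ == "__main__":
    print("deadlock without restart:", simulate(restart_before_wait=False))
    print("deadlock with restart:   ", simulate(restart_before_wait=True))
```

In this model the cycle disappears as soon as neither waiter advertises an xmin, which is the property the proposed Assert is meant to check.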
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services