Re: BUG #6218: TRAP: FailedAssertion( "!(owner->nsnapshots == 0)", File: "resowner.c", Line: 365)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: yamt(at)mwd(dot)biglobe(dot)ne(dot)jp (YAMAMOTO Takashi)
Cc: pgsql-bugs(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: BUG #6218: TRAP: FailedAssertion( "!(owner->nsnapshots == 0)", File: "resowner.c", Line: 365)
Date: 2011-09-26 16:26:37
Message-ID: 8244.1317054397@sss.pgh.pa.us
Lists: pgsql-bugs pgsql-hackers

yamt(at)mwd(dot)biglobe(dot)ne(dot)jp (YAMAMOTO Takashi) writes:
>> Maybe, but I'd still like to see a test case, because I can't reproduce
>> any such problem by preparing ROLLBACK in an aborted transaction.

> reading GetTransactionSnapshot, it seems that the problem happens
> only with IsolationUsesXactSnapshot() true.

Hmm. I'm inclined to think that this demonstrates a bug in snapshot
management, not so much in plancache. We have plancache doing

    PushActiveSnapshot(GetTransactionSnapshot());

and then later

    PopActiveSnapshot();

and at this point surely it is not plancache's fault if there is any
remaining refcount for the snapshot. There is, though, because
GetTransactionSnapshot saved a refcount in TopTransactionResourceOwner.
I think it's snapmgr.c's responsibility to make sure that that's cleaned
up, and it's not doing so.
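
To make the leftover refcount concrete, here is a minimal toy model of the sequence above. It is only a sketch: every name prefixed with toy_ or Toy is hypothetical, the counters are simplified stand-ins rather than the real snapmgr.c or resowner.c structures, and toy_late_snapshot_cleanup is an assumed illustration of a cleanup step, not an existing function.

```c
/* Toy model only: simplified stand-ins, not the real PostgreSQL code. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a resource owner that counts snapshots. */
typedef struct ToyResourceOwner
{
    int nsnapshots;
} ToyResourceOwner;

/* Hypothetical stand-in for a snapshot with two kinds of references. */
typedef struct ToySnapshot
{
    int regd_count;     /* references held through a resource owner */
    int active_count;   /* references held by the active-snapshot stack */
} ToySnapshot;

static ToyResourceOwner top_xact_owner;
static ToySnapshot xact_snapshot;
static bool first_snapshot_set;

static void
toy_start_transaction(void)
{
    top_xact_owner.nsnapshots = 0;
    xact_snapshot.regd_count = 0;
    xact_snapshot.active_count = 0;
    first_snapshot_set = false;
}

/*
 * Models GetTransactionSnapshot() in the case where
 * IsolationUsesXactSnapshot() is true: the first call registers the
 * snapshot with the top transaction's resource owner.
 */
static ToySnapshot *
toy_get_transaction_snapshot(void)
{
    if (!first_snapshot_set)
    {
        xact_snapshot.regd_count++;
        top_xact_owner.nsnapshots++;
        first_snapshot_set = true;
    }
    return &xact_snapshot;
}

static void
toy_push_active_snapshot(ToySnapshot *snap)
{
    snap->active_count++;
}

static void
toy_pop_active_snapshot(ToySnapshot *snap)
{
    snap->active_count--;
}

/* Models the balanced push/pop pair that plancache performs. */
static int
toy_plancache_revalidate(void)
{
    ToySnapshot *snap = toy_get_transaction_snapshot();

    toy_push_active_snapshot(snap);
    toy_pop_active_snapshot(snap);

    /* The owner's count is what the failed assertion inspects. */
    return top_xact_owner.nsnapshots;
}

/* Assumed cleanup step that drops the owner's registered reference. */
static void
toy_late_snapshot_cleanup(void)
{
    if (first_snapshot_set)
    {
        xact_snapshot.regd_count--;
        top_xact_owner.nsnapshots--;
        first_snapshot_set = false;
    }
}

/* Mirrors the condition checked by the assertion in the bug title. */
static bool
toy_owner_can_be_deleted(void)
{
    return top_xact_owner.nsnapshots == 0;
}
```

In this model, calling toy_start_transaction() and then toy_plancache_revalidate() leaves active_count back at zero while the owner still holds one snapshot, so toy_owner_can_be_deleted() stays false until something like toy_late_snapshot_cleanup() runs.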

The place where that refcount normally gets dropped is
AtEarlyCommit_Snapshot, but that isn't going to be called at all in
aborted-transaction cleanup. Worse, if we just transposed it over to be
called in a place in AbortTransaction comparable to where it's called
during commit, that still wouldn't fix the problem, because when the
ROLLBACK happens, we've already aborted the transaction.

I think that AtEarlyCommit_Snapshot is misdesigned, and that far from
being done "early" in commit/abort, it needs to be done "late", like
somewhere not very long before the
ResourceOwnerDelete(TopTransactionResourceOwner) calls. There is no
very good reason to think that someone might not ask for a snapshot
during commit processing.

Alvaro, do you happen to remember why this got designed as an "early"
transaction shutdown action, rather than delaying it as long as
possible?

regards, tom lane
