Re: BUG #6218: TRAP: FailedAssertion( "!(owner->nsnapshots == 0)", File: "resowner.c", Line: 365)

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: yamt <yamt(at)mwd(dot)biglobe(dot)ne(dot)jp>, Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #6218: TRAP: FailedAssertion( "!(owner->nsnapshots == 0)", File: "resowner.c", Line: 365)
Date: 2011-09-26 21:29:53
Message-ID: 1317070986-sup-4844@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers


Excerpts from Tom Lane's message of lun sep 26 13:26:37 -0300 2011:
>
> yamt(at)mwd(dot)biglobe(dot)ne(dot)jp (YAMAMOTO Takashi) writes:
> >> Maybe, but I'd still like to see a test case, because I can't reproduce
> >> any such problem by preparing ROLLBACK in an aborted transaction.
>
> > reading GetTransactionSnapshot, it seems that the problem happens
> > only with IsolationUsesXactSnapshot() true.
>
> Hmm. I'm inclined to think that this demonstrates a bug in snapshot
> management, not so much in plancache. We have plancache doing
>
> PushActiveSnapshot(GetTransactionSnapshot());
>
> and then later
>
> PopActiveSnapshot();
>
> and at this point surely it is not plancache's fault if there is any
> remaining refcount for the snapshot. There is, though, because
> GetTransactionSnapshot saved a refcount in TopTransactionResourceOwner.
> I think it's snapmgr.c's responsibility to make sure that that's cleaned
> up, and it's not doing so.

Agreed.

> The place where that refcount normally gets dropped is
> AtEarlyCommit_Snapshot, but that isn't going to be called at all in
> aborted-transaction cleanup. Worse, if we just transposed it over to be
> called in a place in AbortTransaction comparable to where it's called
> during commit, that still wouldn't fix the problem, because when the
> ROLLBACK happens, we've already aborted the transaction.

... ouch.

> I think that AtEarlyCommit_Snapshot is misdesigned, and that far from
> being done "early" in commit/abort, it needs to be done "late", like
> somewhere not very long before the
> ResourceOwnerDelete(TopTransactionResourceOwner) calls. There is no
> very good reason to think that someone might not ask for a snapshot
> during commit processing.
>
> Alvaro, do you happen to remember why this got designed as an "early"
> transaction shutdown action, rather than delaying it as long as
> possible?

As far as I remember, the only principle was that it had to run before
ResourceOwner cleanup. Commit 7b640b0345dc4fbd39ff568700985b432f6afa07
introduces that "early" call; ResOwner support had been introduced 10
days before in 6bbef4e5383c99d93aa974e2c79d328cfbd1c4a9. I probably
just tried it out and noticed that resowner.c complained if I didn't
drop the refcount prior to its own cleanup.

I don't think I ever considered the scenario of calls in aborted
transactions.

Shall I work on a fix? I expect you are plenty busy with commitfest
stuff, but please let me know otherwise.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2011-09-26 21:46:12 Re: BUG #6218: TRAP: FailedAssertion( "!(owner->nsnapshots == 0)", File: "resowner.c", Line: 365)
Previous Message Alvaro Herrera 2011-09-26 20:10:45 Re: BUG #6226: Broken foreign key stored on database (parent deleted with children still readable, BUG#6225 Update)

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-09-26 21:46:12 Re: BUG #6218: TRAP: FailedAssertion( "!(owner->nsnapshots == 0)", File: "resowner.c", Line: 365)
Previous Message Andrew Dunstan 2011-09-26 21:16:01 Re: psql setenv command