Re: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line: 1741

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line: 1741
Date: 2012-05-30 17:07:37
Message-ID: 4FC653D9.9000308@enterprisedb.com
Lists: pgsql-hackers

On 26.05.2012 12:21, Erik Rijkers wrote:
> But when that if-block is added the client crashes after a while (sometimes almost immediately; it
> never survives longer than 20 minutes):
>
> 2012-05-26 10:44:22.617 CEST 10274 ERROR: could not fsync file "base/21268/32807": No such file
> or directory
> 2012-05-26 10:44:28.465 CEST 10274 ERROR: could not fsync file "base/21268/32867": No such file
> or directory
> 2012-05-26 10:44:28.587 CEST 10270 FATAL: could not open file "base/21268/32994": No such file or
> directory
> 2012-05-26 10:44:28.588 CEST 10270 CONTEXT: writing block 2508 of relation base/21268/32994
> xlog redo multi-insert (init): rel 1663/21268/33006; blk 3117; 58 tuples
> TRAP: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line: 1741)
> 2012-05-26 10:44:31.131 CEST 10269 LOG: startup process (PID 10270) was terminated by signal 6:
> Aborted
> 2012-05-26 10:44:31.131 CEST 10269 LOG: terminating any other active server processes
>
>
> Crazy scenario, I'll admit, but surely this shouldn't be able to crash the client?

Thanks for the report. I was able to reproduce this with that script,
and I think I see what's going on now.

There's something wrong with the way AccessExclusiveLocks work on a
standby. I did "begin; truncate foo; -- leave the xact open" in the
master, and waited until the xlog records were shipped to the standby.
Then I did this in the standby:

testdb=# begin;
BEGIN
testdb=# select * from foo;
id
----
(0 rows)

testdb=# select locktype, database, relation, virtualtransaction, pid,
mode, granted, fastpath from pg_locks where locktype='relation' and
relation='foo'::regclass;
 locktype | database | relation | virtualtransaction |  pid  |        mode         | granted | fastpath
----------+----------+----------+--------------------+-------+---------------------+---------+----------
 relation |    16384 |    27332 | 2/78               | 24984 | AccessShareLock     | t       | t
 relation |    16384 |    27332 | 1/0                | 24344 | AccessExclusiveLock | t       | f
(2 rows)

The "select * from foo" query should have blocked, because the
transaction in the master is holding an AccessExclusiveLock on the table.

The process holding the AccessExclusiveLock is the startup process. It's
holding the lock on behalf of the transaction in the master. But
something's wrong, and the AccessExclusiveLock doesn't stop a regular
backend from acquiring the AccessShareLock on the table. I suspect the
fast-path locking patch, because this works on 9.1.
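
To check a standby for the same symptom without knowing the table name
in advance, a query along these lines could be used. It is only a
sketch: it assumes the pg_locks columns shown above (fastpath exists
only from 9.2 on), and the aliases x and o are just illustrative. It
lists every relation in the current database on which a granted
AccessExclusiveLock coexists with a granted lock held by another
process, which should never happen.

```sql
-- Sketch: find relations where a granted AccessExclusiveLock coexists
-- with any other granted relation lock held by a different process.
SELECT x.relation::regclass AS rel,
       x.pid                AS exclusive_pid,  -- the startup process on a standby
       o.pid                AS other_pid,
       o.mode               AS other_mode,
       o.fastpath           AS other_fastpath
FROM pg_locks x
JOIN pg_locks o
  ON  o.locktype = 'relation'
  AND o.database = x.database
  AND o.relation = x.relation
  AND o.pid <> x.pid
  AND o.granted
WHERE x.locktype = 'relation'
  AND x.mode = 'AccessExclusiveLock'
  AND x.granted
  -- restrict to the current database so the regclass cast resolves names
  AND x.database = (SELECT oid FROM pg_database
                    WHERE datname = current_database());
```

With correct locking this returns zero rows. In the session above it
would be expected to return one row for foo, pairing the startup
process (pid 24344) with the backend holding the AccessShareLock.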

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
