Re: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line: 1741

From: "Erik Rijkers" <er(at)xs4all(dot)nl>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line: 1741
Date: 2012-05-30 22:00:46
Message-ID: af7eb41eabb863f1676caba13cbbe4a3.squirrel@webmail.xs4all.nl
Lists: pgsql-hackers

On Wed, May 30, 2012 22:25, Robert Haas wrote:
> On Wed, May 30, 2012 at 2:52 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Wed, May 30, 2012 at 1:47 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>>> The process holding the AccessExclusiveLock is the startup process. It's
>>>> holding the lock on behalf of the transaction in the master. But something's
>>>> wrong, and the AccessExclusiveLock doesn't stop a regular backend from
>>>> acquiring the AccessShareLock on the table. I suspect the fast-path locking
>>>> patch, because this works on 9.1.
>>>
>>> Yeah, apparently so.  gdb says that FastPathStrongRelationLocks on the
>>> standby is all-zeros even after that record has been replayed.  Not
>>> sure how that's possible yet.
>>
>> Ah.  The problem is that FastPathTag() expects that locks on database
>> objects will only be taken by backends with a non-zero value for
>> MyDatabaseId.  Apparently the can-i-use-the-fastpath test and the
>> do-i-need-to-force-other-people-out-of-the-fastpath test need to be a
>> bit more asymmetrical than they are at present.
>
> I've fixed things so that Heikki's test case now behaves as expected.
> Hopefully this fixes Erik's problem as well, but I haven't tested.
>

(I double-checked that I got your latest commit in)

I'm afraid it's not yet resolved; the sync-slave still crashes almost immediately:

master logfile says:
2012-05-30 23:30:07.846 CEST 3918 LOG: standby wal_receiver_01 is now the synchronous standby with priority 1

sync-slave logfile:

[...]
2012-05-30 23:30:07.833 CEST 3908 LOG: database system is ready to accept read only connections
cp: cannot stat `/home/aardvark/pg_stuff/archive_dir/000000010000000000000004': No such file or directory
2012-05-30 23:30:07.845 CEST 3917 LOG: streaming replication successfully connected to primary
2012-05-30 23:40:52.635 CEST 5287 ERROR: could not open relation with OID 26563
2012-05-30 23:40:52.635 CEST 5287 STATEMENT: select current_setting('port') port, count(*) from public.t
2012-05-30 23:40:57.909 CEST 3909 FATAL: could not open file "base/21268/26569": No such file or directory
2012-05-30 23:40:57.909 CEST 3909 CONTEXT: writing block 5152 of relation base/21268/26569
xlog redo multi-insert (init): rel 1663/21268/26581; blk 3852; 35 tuples
TRAP: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line: 1741)
2012-05-30 23:40:58.006 CEST 5331 FATAL: could not open file "base/21268/26569": No such file or directory
2012-05-30 23:40:58.006 CEST 5331 CONTEXT: writing block 5153 of relation base/21268/26569
2012-05-30 23:40:59.661 CEST 3908 LOG: startup process (PID 3909) was terminated by signal 6: Aborted
2012-05-30 23:40:59.661 CEST 3908 LOG: terminating any other active server processes
