| From: | Andrew Dunstan <andrew(at)dunslane(dot)net> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: SR standby hangs | 
| Date: | 2011-04-26 20:44:15 | 
| Message-ID: | 4DB72E9F.8000000@dunslane.net | 
| Lists: | pgsql-hackers | 
On 04/26/2011 04:28 PM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> This has happened again. This time we have some debug info available,
>> and can possibly get more, if people tell me what will be helpful:
>>      (gdb) f 2
>>      #2  0x00000000005de735 in LockBufferForCleanup (buffer=310163) at
>>      bufmgr.c:2432
>>      2432                ProcWaitForSignal();
>>      (gdb) p *bufHdr
>>      $2 = {tag = {rnode = {spcNode = 16393, dbNode = 40475, relNode =
>>      41880}, forkNum = MAIN_FORKNUM, blockNum = 18913}, flags = 6,
>>      usage_count = 1, refcount = 1, wait_backend_pid = 9111,
>>         buf_hdr_lock = 0 '\000', buf_id = 310162, freeNext = -2,
>>      io_in_progress_lock = 620448, content_lock = 620449}
> Well, that's pretty interesting: refcount is only 1, and the
> BM_PIN_COUNT_WAITER flag is not set.
I noticed that.
> AFAICS this *must* mean that the
> buffer had been pinned and whoever had it (presumably bgwriter) did
> UnpinBuffer().  So it appears that the signal just plain got lost :-(,
> which suggests a kernel bug.  What platform is this on, again?
CentOS 5.5, x86_64, kernel 2.6.18-194.32.1.el5
cheers
andrew
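
For readers following along, here is a minimal sketch of how the `flags = 6` value in the gdb output can be decoded by hand. The bit positions below are an assumption, recalled from `buf_internals.h` in sources of roughly this vintage, and should be checked against the tree actually being debugged. The helper `decode_buf_flags` and the table `flag_names` are hypothetical names used only for illustration; they are not part of PostgreSQL.

```c
#include <stdio.h>
#include <string.h>

/*
 * ASSUMPTION: bit positions as recalled from buf_internals.h of the
 * 9.0 era. Verify against the source tree you are debugging.
 */
#define PIN_COUNT_WAITER_BIT (1u << 6)   /* assumed value of BM_PIN_COUNT_WAITER */

struct flag_name
{
    unsigned    bit;
    const char *name;
};

static const struct flag_name flag_names[] = {
    {1u << 0, "BM_DIRTY"},
    {1u << 1, "BM_VALID"},
    {1u << 2, "BM_TAG_VALID"},
    {1u << 3, "BM_IO_IN_PROGRESS"},
    {1u << 4, "BM_IO_ERROR"},
    {1u << 5, "BM_JUST_DIRTIED"},
    {1u << 6, "BM_PIN_COUNT_WAITER"},
    {1u << 7, "BM_CHECKPOINT_NEEDED"},
};

/* Hypothetical helper: write the names of the set bits into out, joined by '|'. */
static void
decode_buf_flags(unsigned flags, char *out, size_t outlen)
{
    size_t      used = 0;
    size_t      i;

    if (outlen == 0)
        return;
    out[0] = '\0';

    for (i = 0; i < sizeof(flag_names) / sizeof(flag_names[0]); i++)
    {
        int         n;

        if ((flags & flag_names[i].bit) == 0)
            continue;

        n = snprintf(out + used, outlen - used, "%s%s",
                     used > 0 ? "|" : "", flag_names[i].name);
        if (n < 0 || (size_t) n >= outlen - used)
            break;              /* output buffer too small; stop here */
        used += (size_t) n;
    }
}
```

Under those assumed bit values, calling `decode_buf_flags(6, buf, sizeof buf)` leaves only `BM_VALID` and `BM_TAG_VALID` in `buf`, and `6 & PIN_COUNT_WAITER_BIT` is zero, which matches the observation quoted above that the waiter flag is not set.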