| From: | Merlin Moncure <mmoncure(at)gmail(dot)com> | 
|---|---|
| To: | Peter Geoghegan <pg(at)heroku(dot)com> | 
| Cc: | Andres Freund <andres(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: hung backends stuck in spinlock heavy endless loop | 
| Date: | 2015-01-28 18:26:01 | 
| Message-ID: | CAHyXU0zRvKFAYozRsa_tm19zA2hAKiEUi0GU4xbWCvuyUaGFjw@mail.gmail.com | 
| Lists: | pgsql-hackers | 
On Wed, Jan 28, 2015 at 8:05 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Thu, Jan 22, 2015 at 3:50 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> I still haven't categorically ruled out pl/sh yet; that's something to
>> keep in mind.
>
> Well, after bisection proved not to be fruitful, I replaced the pl/sh
> calls with dummy calls that approximated the same behavior and the
> problem went away.  So again, it looks like this might be a lot of
> false alarm.  A pl/sh driven failure might still be interesting if
> it's coming from the internal calls it's making, so I'm still chasing
> it down.
...hm, I spoke too soon.  So I deleted everything, booted up a new
vanilla 9.4 instance with asserts on, and took no other action.
Applying the script with no data activity fails an assertion every
single time:
mmoncure(at)mernix2 12:25 PM (REL9_4_STABLE) ~/src/p94$ cat
/mnt/ssd/data/pg_log/postgresql-28.log
[ 12287 2015-01-28 12:24:24.080 CST 0]LOG:  received smart shutdown request
[ 13516 2015-01-28 12:24:24.080 CST 0]LOG:  autovacuum launcher shutting down
[ 13513 2015-01-28 12:24:24.081 CST 0]LOG:  shutting down
[ 13513 2015-01-28 12:24:24.083 CST 0]LOG:  database system is shut down
[ 14481 2015-01-28 12:24:25.127 CST 0]LOG:  database system was shut
down at 2015-01-28 12:24:24 CST
[ 14457 2015-01-28 12:24:25.129 CST 0]LOG:  database system is ready
to accept connections
[ 14485 2015-01-28 12:24:25.129 CST 0]LOG:  autovacuum launcher started
TRAP: FailedAssertion("!(flags & 0x0010)", File: "dynahash.c", Line: 330)
[ 14457 2015-01-28 12:24:47.983 CST 0]LOG:  server process (PID 14545)
was terminated by signal 6: Aborted
[ 14457 2015-01-28 12:24:47.983 CST 0]DETAIL:  Failed process was
running: SELECT CDSStartRun()
[ 14457 2015-01-28 12:24:47.983 CST 0]LOG:  terminating any other
active server processes
[cds2 14546 2015-01-28 12:24:47.983 CST 0]WARNING:  terminating
connection because of crash of another server process
[cds2 14546 2015-01-28 12:24:47.983 CST 0]DETAIL:  The postmaster has
commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
[cds2 14546 2015-01-28 12:24:47.983 CST 0]HINT:  In a moment you
should be able to reconnect to the database and repeat your command.
[ 14485 2015-01-28 12:24:47.983 CST 0]WARNING:  terminating connection
because of crash of another server process
[ 14485 2015-01-28 12:24:47.983 CST 0]DETAIL:  The postmaster has
commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
[ 14485 2015-01-28 12:24:47.983 CST 0]HINT:  In a moment you should be
able to reconnect to the database and repeat your command.
[ 14457 2015-01-28 12:24:47.984 CST 0]LOG:  all server processes
terminated; reinitializing
[ 14554 2015-01-28 12:24:47.995 CST 0]LOG:  database system was
interrupted; last known up at 2015-01-28 12:24:25 CST
[ 14554 2015-01-28 12:24:47.995 CST 0]LOG:  database system was not
properly shut down; automatic recovery in progress
[ 14554 2015-01-28 12:24:47.997 CST 0]LOG:  invalid magic number 0000
in log segment 000000010000000000000001, offset 13000704
[ 14554 2015-01-28 12:24:47.997 CST 0]LOG:  redo is not required
[ 14457 2015-01-28 12:24:48.000 CST 0]LOG:  database system is ready
to accept connections
[ 14558 2015-01-28 12:24:48.000 CST 0]LOG:  autovacuum launcher started
merlin