Re: anole: assorted stability problems

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: anole: assorted stability problems
Date: 2015-05-26 03:05:51
Message-ID: 20150526030551.GU32396@alap3.anarazel.de
Lists: pgsql-hackers

On 2015-05-20 16:21:57 -0300, Alvaro Herrera wrote:
> In HEAD only. Previous branches seem mostly clean, so there's something
> going wrong. Spinlocks going wrong perhaps?
>
> http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=anole&dt=2015-05-20%2016%3A30%3A26&stg=check
> ! PANIC: stuck spinlock (c00000000d6f4140) detected at lwlock.c:816
> ! server closed the connection unexpectedly
> ! This probably means the server terminated abnormally
> ! before or while processing the request.
> ! connection to server was lost
>
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2015-05-09%2020%3A30%3A29
> ! PANIC: semop(id=0) failed: Result too large
> ! server closed the connection unexpectedly
> ! This probably means the server terminated abnormally
> ! before or while processing the request.
> ! connection to server was lost
>
> http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=anole&dt=2015-05-05%2018%3A39%3A38&stg=check
> ! FATAL: semop(id=0) failed: File too large
> ! server closed the connection unexpectedly
> ! This probably means the server terminated abnormally
> ! before or while processing the request.
> ! connection to server was lost
>
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2015-05-03%2012%3A30%3A18
> ! PANIC: semop(id=-1073741824) failed: Invalid argument
> ! server closed the connection unexpectedly
> ! This probably means the server terminated abnormally
> ! before or while processing the request.
> ! connection to server was lost
>
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2015-04-29%2004%3A30%3A25
> ! PANIC: stuck spinlock (c00000000d335360) detected at lwlock.c:767
> ! server closed the connection unexpectedly
> ! This probably means the server terminated abnormally
> ! before or while processing the request.
> ! connection to server was lost

And now:

! FATAL: semop(id=-2013265921) failed: Invalid argument
! CONTEXT: SQL statement "CREATE TEMP TABLE brin_result (cid tid)"
! PL/pgSQL function inline_code_block line 20 at SQL statement
! server closed the connection unexpectedly
! This probably means the server terminated abnormally
! before or while processing the request.
! connection to server was lost

Uhm:

void
s_init_lock_sema(volatile slock_t *lock)
{
	static int	counter = 0;

	*lock = (++counter) % NUM_SPINLOCK_SEMAPHORES;
}

One problem here might be that counter is signed. Once s_init_lock_sema
has been called often enough for counter to wrap around, strange things
will happen. But I don't see why this codepath would even be hit
once on this platform? It's only built with !HAVE_SPINLOCKS, which isn't
the case on anole. So this appears to be an independent bug (9.4+).
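
To illustrate the wraparound concern, here is a minimal standalone sketch,
not the actual fix. N_SEMAS is a hypothetical stand-in for
NUM_SPINLOCK_SEMAPHORES (1024 is just an assumed value), and the helper
names are made up for the example. It also assumes a 32-bit int. Strictly
speaking, overflowing a signed int is undefined behaviour in C, so the
sketch feeds in already-negative values instead of overflowing; the point
is only that % on a negative left operand yields a negative (or zero)
result, which is not a valid semaphore index.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for NUM_SPINLOCK_SEMAPHORES (assumed value). */
#define N_SEMAS 1024

/* Index as the quoted code computes it, for a given signed counter value. */
static int
sema_index_signed(int counter)
{
	/* C99 division truncates toward zero, so the sign follows counter. */
	return counter % N_SEMAS;
}

/* Same computation with an unsigned counter: always in [0, N_SEMAS). */
static unsigned
sema_index_unsigned(unsigned counter)
{
	return counter % N_SEMAS;
}

/* Small usage example comparing the two variants. */
static void
check_examples(void)
{
	/* A counter that has gone negative produces a negative index. */
	assert(sema_index_signed(-5) == -5);

	/* The unsigned variant stays in range across the same bit patterns. */
	assert(sema_index_unsigned(UINT_MAX) < N_SEMAS);
}
```

With an unsigned counter the wrap itself is well defined and the index stays
in range, which is presumably the direction a fix would take.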

One that has led me to find an atomics bug (9.5+, stupid forgotten
codepath for atomics on spinlocks on semaphores) - which again should be
independent, because it again is only relevant when spinlocks aren't
used...

I'll fix both.

But that leaves this problem.
