Re: castoroides spinlock failure on test_shm_mq

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: castoroides spinlock failure on test_shm_mq
Date: 2015-06-21 19:39:58
Message-ID: 20150621193958.GB4797@alap3.anarazel.de
Lists: pgsql-hackers

On 2015-06-20 09:35:39 -0400, Robert Haas wrote:
> On Sat, Jun 20, 2015 at 12:24 AM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
> > Has anybody noticed the way castoroides is randomly failing?
> >
> > SELECT test_shm_mq_pipelined(16384, (select string_agg(chr(32+(random()*95)::int), '') from generate_series(1,270000)), 200, 3);
> > ! PANIC: stuck spinlock (100cb92f4) detected at atomics.c:30
> > ! server closed the connection unexpectedly
> > ! This probably means the server terminated abnormally
> > ! before or while processing the request.
> > ! connection to server was lost
>
> Yeah, Andres and I discussed it a month ago:
>
> http://www.postgresql.org/message-id/20150527225528.GP5310@alap3.anarazel.de
>
> I think we're going to need to try to implement real memory barriers
> on all architectures we support. It's not clear whether there's some
> suitable generic fallback that we could use or whether we're going to
> need something different for each case. I had thought Andres was
> planning to work on this.

I am. I'd posted on the other thread that I want to use
waitpid(PostmasterPid, WNOHANG) as the fallback for now. Unless somebody
protests I'm going to commit that first, wait for a while to see whether
it stabilizes the Solaris members, and then commit a better fallback for
Solaris with suncc.

Greetings,

Andres Freund
