Re: PL/R Median Busts Commit (Postgres 9.1.6 + plr 8.3.0.13 on Ubuntu 12.10 64 bit)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>, pgsql-bugs(at)postgresql(dot)org, Joe Conway <mail(at)joeconway(dot)com>
Subject: Re: PL/R Median Busts Commit (Postgres 9.1.6 + plr 8.3.0.13 on Ubuntu 12.10 64 bit)
Date: 2013-01-25 00:10:57
Message-ID: 20130125001057.GB18817@awork2.anarazel.de
Lists: pgsql-bugs

On 2013-01-24 19:06:21 -0500, Tom Lane wrote:
> Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> writes:
> > If I have done this right, then this is the trace for the 1st message...
> > from my wandering through the calls here it looks like a normal commit,
> > and something goes a bit weird as SI messages are being processed...
>
> Seems like the critical bit is here:
>
> > #11 0x00007f4e2a53d985 in exit () from /lib/x86_64-linux-gnu/libc.so.6
> > #12 0x00007f4e272b951a in ?? () from /usr/lib/libR.so
> > #13 <signal handler called>
> > #14 0x00007f4e2a538707 in kill () from /lib/x86_64-linux-gnu/libc.so.6
> > #15 0x00000000006152e5 in SICleanupQueue (
> > callerHasWriteLock=callerHasWriteLock@entry=1 '\001',
> > minFree=minFree@entry=4) at sinvaladt.c:672
>
> Frame 15 is definitely SICleanupQueue trying to send a catchup SIGUSR1
> interrupt to the furthest-behind backend. The fact that we go directly
> into a signal handler from the kill() suggests that the furthest-behind
> backend is actually *this* backend, which perhaps is a bit surprising,
> but it's supposed to work. What it looks like, though, is that libR has
> commandeered the SIGUSR1 signal handler, and just to be extra special
> unfriendly to the surrounding program, it does an exit() when it traps a
> SIGUSR1.
>
> Unless libR can be coerced into not screwing up our signal handlers,
> I'd say that PL/R is broken beyond repair. That would be unfortunate.

I wonder whether we could place some Asserts somewhere that verify our
signal handlers are still set up. This isn't the first bug caused by PLs
or libraries overriding them...
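
To make the idea concrete, here is a minimal standalone sketch of such a
check. It is not PostgreSQL code: expected_usr1_handler, install_handler
and handler_is_installed are hypothetical names made up for illustration,
and in the backend the expected handler would presumably be whatever was
registered at startup. The only real API used is POSIX sigaction(), which
reports the current disposition without changing it when the new-action
argument is NULL.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the handler the host program expects on SIGUSR1. */
static volatile sig_atomic_t usr1_seen = 0;

static void
expected_usr1_handler(int signo)
{
    (void) signo;
    usr1_seen = 1;
}

/* Install a plain (non-SA_SIGINFO) handler; returns true on success. */
static bool
install_handler(int signo, void (*handler) (int))
{
    struct sigaction act;

    memset(&act, 0, sizeof(act));
    act.sa_handler = handler;
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_RESTART;
    return sigaction(signo, &act, NULL) == 0;
}

/*
 * Query the current disposition of signo without modifying it and report
 * whether it is still the handler we expect.  A library that replaced the
 * handler (or reset it to SIG_DFL/SIG_IGN) makes this return false.
 */
static bool
handler_is_installed(int signo, void (*expected) (int))
{
    struct sigaction cur;

    if (sigaction(signo, NULL, &cur) != 0)
        return false;
    if (cur.sa_flags & SA_SIGINFO)
        return false;           /* someone installed a 3-argument handler */
    return cur.sa_handler == expected;
}
```

A caller could then do something like
assert(handler_is_installed(SIGUSR1, expected_usr1_handler)) after
returning from code that loads or calls into a third-party library. The
check only catches handlers replaced through signal()/sigaction(); it
says nothing about a changed signal mask.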

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
