Re: PL/R Median Busts Commit (Postgres 9.1.6 + plr 8.3.0.13 on Ubuntu 12.10 64 bit)

From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org, Joe Conway <mail(at)joeconway(dot)com>
Subject: Re: PL/R Median Busts Commit (Postgres 9.1.6 + plr 8.3.0.13 on Ubuntu 12.10 64 bit)
Date: 2013-01-25 00:39:27
Message-ID: 5101D43F.90102@catalyst.net.nz
Lists: pgsql-bugs

On 25/01/13 13:06, Tom Lane wrote:
> Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> writes:
>> If I have done this right, then this is the trace for the 1st message...
>> from my wandering through the calls here it looks like a normal commit,
>> and something goes a bit weird as SI messages are being processed...
>
> Seems like the critical bit is here:
>
>> #11 0x00007f4e2a53d985 in exit () from /lib/x86_64-linux-gnu/libc.so.6
>> #12 0x00007f4e272b951a in ?? () from /usr/lib/libR.so
>> #13 <signal handler called>
>> #14 0x00007f4e2a538707 in kill () from /lib/x86_64-linux-gnu/libc.so.6
>> #15 0x00000000006152e5 in SICleanupQueue (
>> callerHasWriteLock=callerHasWriteLock@entry=1 '\001',
>> minFree=minFree@entry=4) at sinvaladt.c:672
>
> Frame 15 is definitely SICleanupQueue trying to send a catchup SIGUSR1
> interrupt to the furthest-behind backend. The fact that we go directly
> into a signal handler from the kill() suggests that the furthest-behind
> backend is actually *this* backend, which perhaps is a bit surprising,
> but it's supposed to work. What it looks like, though, is that libR has
> commandeered the SIGUSR1 signal handler, and just to be extra special
> unfriendly to the surrounding program, it does an exit() when it traps a
> SIGUSR1.
>
> Unless libR can be coerced into not screwing up our signal handlers,
> I'd say that PL/R is broken beyond repair. That would be unfortunate.
>
> regards, tom lane

It looks like Joe has already run into something similar with libR
stealing SIGINT: PL/R reinstalls the Postgres handler for it. A simple
patch along the same lines for SIGUSR1 (attached) seems to fix the issue.
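
For anyone following along without the attachment, the general idea can be sketched in plain POSIX C. This is not the attached patch and not PL/R code: fake_library_init() is a hypothetical stand-in for an embedded library that grabs SIGUSR1 during its startup, and the other function names are made up for the illustration. The sketch saves the host's SIGUSR1 disposition, lets the library initialise, then puts the saved disposition back.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Counters so we can see which handler actually ran. */
static volatile sig_atomic_t host_usr1_seen = 0;
static volatile sig_atomic_t lib_usr1_seen = 0;

/* The host program's own SIGUSR1 handler (illustrative only). */
static void host_usr1_handler(int signo)
{
    (void) signo;
    host_usr1_seen++;
}

/* The handler the library would install over ours (illustrative only). */
static void lib_usr1_handler(int signo)
{
    (void) signo;
    lib_usr1_seen++;
}

/* Hypothetical stand-in for a library init that takes over SIGUSR1. */
static void fake_library_init(void)
{
    struct sigaction sa;

    sa.sa_handler = lib_usr1_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGUSR1, &sa, NULL);
}

/* Save the current SIGUSR1 disposition, run the init, restore it. */
static int init_library_preserving_usr1(void)
{
    struct sigaction saved;

    if (sigaction(SIGUSR1, NULL, &saved) != 0)
        return -1;
    fake_library_init();
    return sigaction(SIGUSR1, &saved, NULL);
}

/* Returns 1 if the host handler, not the library's, caught the signal. */
int demo_restore_usr1(void)
{
    struct sigaction sa;
    int rc;

    sa.sa_handler = host_usr1_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    rc = init_library_preserving_usr1();
    assert(rc == 0);

    /* Signal ourselves, as in the backtrace where the target is this process. */
    raise(SIGUSR1);

    return host_usr1_seen == 1 && lib_usr1_seen == 0;
}
```

Calling demo_restore_usr1() from a small main() should show the host handler firing and the library's handler never running. Inside a backend one would presumably restore the handler Postgres itself installed rather than a saved copy, but that detail is an assumption on my part here.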

I wonder if we need to reinstall *all* the remaining signal handlers too?

Cheers

Mark

Attachment Content-Type Size
plr.c.diff text/x-patch 363 bytes
