Re: PL/Python fails on new NetBSD/PPC 8.0 install

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Benjamin Scherrey <scherrey(at)proteus-tech(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: PL/Python fails on new NetBSD/PPC 8.0 install
Date: 2019-10-29 20:25:16
Message-ID: 4344.1572380716@sss.pgh.pa.us
Lists: pgsql-hackers

Benjamin Scherrey <scherrey(at)proteus-tech(dot)com> writes:
> None of the output provides any clue to me but I do know that Python 3.7
> has some issues with a lot of versions of openssl that is based on a
> disagreement between devs in both projects. This was a problem for me when
> trying to build python 3.7 on my Kubuntu 14.04 system. I've seen this issue
> reported across all targets for Python including Freebsd so I expect it's
> likely to also happen for NetBSD.

Thanks for looking! It doesn't seem to be related to this issue though.
I've now tracked this problem down, and what I'm finding is that:

1. The proximate cause of the crash is that pthread_self() is
returning ((pthread_t) -1), which Python interprets as a hard
failure. Now on the one hand, I wonder why Python is even
checking for a failure, given that POSIX is totally clear that
there are no failures:

    The pthread_self() function shall always be successful and no
    return value is reserved to indicate an error.

"Shall" does not allow wiggle room. But on the other hand,
pthread_t is a pointer on this platform, so that's a pretty
strange value to be returning if it's valid.

And on the third hand, NetBSD's own man page for pthread_self()
doesn't admit the possibility of failure either, though it does
suggest that you should link with -lpthread [1].

2. Testing pthread_self() standalone on this platform provides
illuminating results:

$ cat test.c
#include <stdio.h>
#include <pthread.h>

int main()
{
    pthread_t id = pthread_self();

    printf("self = %p\n", id);
    return 0;
}
$ gcc test.c
$ ./a.out
self = 0xffffffffffffffff
$ gcc test.c -lpthread
$ ./a.out
self = 0x754ae5a2b800

3. libpython.so on this platform has a dependency on libpthread,
but we don't link the postgres executable to libpthread. I surmise
that pthread_self() actually exists in core libc, but what it returns
is only valid if libpthread was linked into the main executable so
that it could initialize some static state at execution start.

4. If I add -lpthread to the LIBS for the main postgres executable,
PL/Python starts passing its regression tests. I haven't finished
a complete check-world run, but at least the core regression tests
show no ill effects from doing this.

So one possible answer for us is "if we're on NetBSD and plpython3
is to be built, add -lpthread to the core LIBS list". I do not
much like this answer though; it's putting the responsibility in
the wrong place.

What I'm inclined to do is go file a bug report saying that this
behavior contradicts both POSIX and NetBSD's own man page, and
see what they say about that.
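
For anyone who wants to check their own platform, here is a minimal
sketch of the kind of test described in point 1: compare the result of
pthread_self() against the ((pthread_t) -1) value. The helper names
thread_id_is_sentinel() and check_pthread_self() are hypothetical,
invented for this illustration. This is not Python's actual code, and
it assumes pthread_t is an integer or pointer type so that the cast
from -1 is valid.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 if "id" equals the ((pthread_t) -1)
 * value seen in the failing case, else 0.  Assumes pthread_t is an
 * integer or pointer type (true on the platforms discussed here).
 */
int
thread_id_is_sentinel(pthread_t id)
{
    return pthread_equal(id, (pthread_t) -1) != 0;
}

/*
 * Hypothetical helper: prints the current thread ID and returns 0 if it
 * looks usable, or returns 1 if pthread_self() handed back the sentinel.
 */
int
check_pthread_self(void)
{
    pthread_t id = pthread_self();

    if (thread_id_is_sentinel(id))
    {
        fprintf(stderr, "pthread_self() returned (pthread_t) -1\n");
        return 1;
    }

    /* Going through uintptr_t works for integer and pointer pthread_t. */
    printf("pthread_self() = %p\n", (void *) (uintptr_t) id);
    return 0;
}
```

Calling check_pthread_self() from a main() built once without and once
with -lpthread should reproduce the comparison in point 2 on an
affected system. On platforms where libc provides a fully working
pthread_self(), both builds would be expected to return 0.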

regards, tom lane

[1] https://netbsd.gw.com/cgi-bin/man-cgi?pthread_self+3+NetBSD-current