Re: Solaris ecpg program doesn't work - pulling my hair

From: <wespvp(at)syntegra(dot)com>
To: Michael Meskes <meskes(at)postgresql(dot)org>
Cc: PostgreSQL <pgsql-general(at)postgresql(dot)org>
Subject: Re: Solaris ecpg program doesn't work - pulling my hair
Date: 2004-03-25 14:07:04
Message-ID: BC8843A8.77C2%wespvp@syntegra.com
Lists: pgsql-general

> We had this in the past. I'm not sure and would have to search the
> archives but I vaguely remember that this has been a threading bug in
> the Solaris version. Could you please try using 7.4.2 or cvs head where
> this should be fixed? Alternatively you could try with threading
> disabled.

I verified last night that this problem also occurs with 7.4.2. I did some
more extensive testing on the solution in my previous follow-up email. That
is definitely the problem - configure is setting "-pthread" instead of
"-lpthread" in config.status. After manually correcting this in
config.status, everything works properly.

I don't know enough about configure to know how to fix it there. It is
properly setting -lpthread on Linux.

It's also not clear why the symptoms occur, since the build does not abort
with an unsatisfied external. It must be picking up the pthread externals
from somewhere else? The only difference I can see in the ldd output is the
order of the libraries. An ldd of ecpglib shows:

Good:

gcc -shared -h libecpg.so.4 execute.o typename.o descriptor.o data.o error.o
prepare.o memory.o connect.o misc.o -L../../../../src/port
-L/mhinteg/trees/4/sun32_fixes/ported/openssl -L../pgtypeslib -lpgtypes
-L../../../../src/interfaces/libpq -lpq -lssl -lcrypto -lm -lpthread
-R/home/wrp/local/pgsql.7.4.2/lib -o libecpg.so.4.1
rm -f libecpg.so.4
ln -s libecpg.so.4.1 libecpg.so.4
rm -f libecpg.so
ln -s libecpg.so.4.1 libecpg.so

% ldd libecpg.so
libpgtypes.so.1 => /home/wrp/local/pgsql.7.4.2/lib/libpgtypes.so.1
libpq.so.3 => /home/wrp/local/pgsql.7.4.2/lib/libpq.so.3
libssl.so.0.9.7 => /mhinteg/trees/4/sun32_fixes/ported/openssl/libssl.so.0.9.7
libcrypto.so.0.9.7 => /mhinteg/trees/4/sun32_fixes/ported/openssl/libcrypto.so.0.9.7
libm.so.1 => /usr/lib/libm.so.1
libpthread.so.1 => /usr/lib/libpthread.so.1
libresolv.so.2 => /usr/lib/libresolv.so.2
libsocket.so.1 => /usr/lib/libsocket.so.1
libnsl.so.1 => /usr/lib/libnsl.so.1
libdl.so.1 => /usr/lib/libdl.so.1
libc.so.1 => /usr/lib/libc.so.1
libmp.so.2 => /usr/lib/libmp.so.2
libthread.so.1 => /usr/lib/libthread.so.1
/usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1

Bad:

gcc -shared -h libecpg.so.4 execute.o typename.o descriptor.o data.o error.o
prepare.o memory.o connect.o misc.o -L../../../../src/port
-L/mhinteg/trees/4/sun32_fixes/ported/openssl -L../pgtypeslib -lpgtypes
-L../../../../src/interfaces/libpq -lpq -lssl -lcrypto -lm -pthread
-R/home/wrp/local/pgsql.7.4.2/lib -o libecpg.so.4.1
gcc: unrecognized option `-pthread'
rm -f libecpg.so.4
ln -s libecpg.so.4.1 libecpg.so.4
rm -f libecpg.so
ln -s libecpg.so.4.1 libecpg.so

% !ldd
ldd libecpg.so
libpgtypes.so.1 => /home/wrp/local/pgsql.7.4.2/lib/libpgtypes.so.1
libpq.so.3 => /home/wrp/local/pgsql.7.4.2/lib/libpq.so.3
libssl.so.0.9.7 => /mhinteg/trees/4/sun32_fixes/ported/openssl/libssl.so.0.9.7
libcrypto.so.0.9.7 => /mhinteg/trees/4/sun32_fixes/ported/openssl/libcrypto.so.0.9.7
libm.so.1 => /usr/lib/libm.so.1
libresolv.so.2 => /usr/lib/libresolv.so.2
libsocket.so.1 => /usr/lib/libsocket.so.1
libnsl.so.1 => /usr/lib/libnsl.so.1
libpthread.so.1 => /usr/lib/libpthread.so.1
libdl.so.1 => /usr/lib/libdl.so.1
libc.so.1 => /usr/lib/libc.so.1
libmp.so.2 => /usr/lib/libmp.so.2
libthread.so.1 => /usr/lib/libthread.so.1
/usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1

I realize it isn't entirely meaningful without the source code to know
exactly where I put the print statements, but here is my debug output
running the previously enclosed test program. You can see that it is
allocating a new sqlca structure when it shouldn't be.

Good:

% ./testit
ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
ECPGINIT: address of sqlca = 0x23b98
ECPGget_sqlca: before return: address of sqlca = 0x23b98
ECPGINIT: address of sqlca = 0x23b98
In ECPGconnect
ECPGconnect: address of sqlca = 0x23b98
Before connection check
bad connection
ECPGconnect: address of sqlca = 0x23b98
ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
ECPGget_sqlca: before return: address of sqlca = 0x23b98
In error.c - code = -402
ECPGraise: address of sqlca = 0x23b98
After ECPGraise, sqlca->sqlcode = -402
ECPGconnect: address of sqlca = 0x23b98
Before return false, sqlca->sqlcode = -402
ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
ECPGget_sqlca: before return: address of sqlca = 0x23b98
ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
ECPGget_sqlca: before return: address of sqlca = 0x23b98
Connect failure: -402

Bad:

% ./testit
ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
ECPGINIT: address of sqlca = 0x23900
ECPGget_sqlca: before return: address of sqlca = 0x23900
ECPGINIT: address of sqlca = 0x23900
In ECPGconnect
ECPGconnect: address of sqlca = 0x23900
Before connection check
bad connection
ECPGconnect: address of sqlca = 0x23900
ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
ECPGINIT: address of sqlca = 0x251b0
ECPGget_sqlca: before return: address of sqlca = 0x251b0
In error.c - code = -402
ECPGraise: address of sqlca = 0x251b0
After ECPGraise, sqlca->sqlcode = 0
ECPGconnect: address of sqlca = 0x23900
Before return false, sqlca->sqlcode = 0
ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
ECPGINIT: address of sqlca = 0x25248
ECPGget_sqlca: before return: address of sqlca = 0x25248
ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
ECPGINIT: address of sqlca = 0x252e0
ECPGget_sqlca: before return: address of sqlca = 0x252e0
ECPGINIT: address of sqlca = 0x252e0
ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
ECPGINIT: address of sqlca = 0x25378
ECPGget_sqlca: before return: address of sqlca = 0x25378
In error.c - code = -220
ECPGraise: address of sqlca = 0x25378
ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
ECPGINIT: address of sqlca = 0x25410
ECPGget_sqlca: before return: address of sqlca = 0x25410
ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
ECPGINIT: address of sqlca = 0x254a8
ECPGget_sqlca: before return: address of sqlca = 0x254a8
ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
ECPGINIT: address of sqlca = 0x25540
ECPGget_sqlca: before return: address of sqlca = 0x25540
SELECT error code: 0
systemNum = -4261248
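
For anyone who wants to check their own build without patching ecpglib, the
lookup pattern the traces above exercise can be sketched roughly as follows.
This is only an illustration under my own assumptions: the names demo_sqlca,
get_sqlca and sqlca_is_stable are made up and are not the ecpglib source. With
a working pthread library, the second lookup should return the pointer stored
by the first, as in the "Good" trace; in the "Bad" trace pthread_getspecific
keeps returning 0x0, so a fresh structure is allocated on every call.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real sqlca structure. */
struct demo_sqlca { long sqlcode; };

static pthread_key_t  sqlca_key;
static pthread_once_t sqlca_once = PTHREAD_ONCE_INIT;
static int            sqlca_key_status = -1;

/* Create the thread-specific-data key exactly once per process. */
static void make_sqlca_key(void)
{
    sqlca_key_status = pthread_key_create(&sqlca_key, free);
}

/* Return this thread's structure, allocating it on first use only. */
struct demo_sqlca *get_sqlca(void)
{
    struct demo_sqlca *p;

    pthread_once(&sqlca_once, make_sqlca_key);
    if (sqlca_key_status != 0)
        return NULL;

    p = pthread_getspecific(sqlca_key);
    if (p == NULL) {
        p = calloc(1, sizeof *p);
        if (p == NULL || pthread_setspecific(sqlca_key, p) != 0) {
            free(p);
            return NULL;
        }
    }
    return p;
}

/* 1 if two lookups in the same thread see the same structure and value,
   0 if the second lookup got a different one, -1 on setup failure. */
int sqlca_is_stable(void)
{
    struct demo_sqlca *first = get_sqlca();
    struct demo_sqlca *second;

    if (first == NULL)
        return -1;
    first->sqlcode = -402;
    second = get_sqlca();
    return second == first && second->sqlcode == -402;
}
```

Calling sqlca_is_stable() from a small main(), linked once the way the "Good"
build links and once the way the "Bad" build links, should show whether the
stored pointer survives between calls on a given system.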

I just got this in response to a post to pgsql-general on a different
Solaris problem. This sounds like the same problem as I'm seeing. I've
sent him my solution. Hopefully it will solve his symptoms also.

>> One other problem I am looking into (and why I tried to compile with
>> thread safety in the first place) is that this somehow did not turn on
>> -D_REENTRANT in the CFLAGS for libpq. And that leads to libpq not using
>> the threadsafe definition of errno, leading to serious communication
>> trouble in the end (pqReadData() failing with ENOENT while the real
>> error is a harmless EAGAIN from a nonblocking recv()).
>>
>>
>> Jan
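
Jan's point about errno can be checked in a similar way. The sketch below is
my own illustration, not libpq code, and the function names are invented. It
compares the address of errno as seen by two threads: if the build uses the
thread-safe definition, each thread should get its own location, and if it
uses a single global errno, both threads would presumably see the same
address.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Runs in the second thread: record where that thread's errno lives. */
static void *record_errno_address(void *out)
{
    *(uintptr_t *)out = (uintptr_t)&errno;
    return NULL;
}

/* 1 if the second thread has its own errno location, 0 if both threads
   share one, -1 if the thread could not be created or joined. */
int errno_is_per_thread(void)
{
    pthread_t t;
    uintptr_t other = 0;
    uintptr_t mine = (uintptr_t)&errno;

    if (pthread_create(&t, NULL, record_errno_address, &other) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return other != mine;
}
```

A result of 0 when built with the same CFLAGS as libpq would be consistent
with the EAGAIN/ENOENT confusion described above.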

Wes
