Re: Solaris ecpg program doesn't work - pulling my hair

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: wespvp(at)syntegra(dot)com
Cc: Michael Meskes <meskes(at)postgresql(dot)org>, PostgreSQL <pgsql-general(at)postgresql(dot)org>
Subject: Re: Solaris ecpg program doesn't work - pulling my hair
Date: 2004-03-25 19:42:02
Message-ID: 4063360A.3040103@Yahoo.com
Lists: pgsql-general

wespvp(at)syntegra(dot)com wrote:

>> We had this in the past. I'm not sure and would have to search the
>> archives, but I vaguely remember that this was a threading bug in
>> the Solaris version. Could you please try using 7.4.2 or cvs head, where
>> this should be fixed? Alternatively, you could try with threading
>> disabled.
>
> I verified last night that this problem also occurs with 7.4.2. I did some
> more extensive testing on the solution in my previous follow-up email. That
> is definitely the problem - configure is setting "-pthread" instead of
> "-lpthread" in config.status. After manually correcting this in
> config.status, everything works properly.

As stated before, this is not the cause. If you don't compile with
-D_REENTRANT, /usr/include/errno.h declares errno as

extern int errno;

instead of the thread safe

extern int *___errno();
#define errno *(___errno())

At least it does so here on Solaris 8. That leads to libpq using the
global errno variable, which might or might not be the one where "your"
error is in a multithreaded program. I mailed the correct solution as a
follow-up to the other thread earlier today, as a patch against 7.4.2.
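
To make the difference concrete, here is a minimal sketch (not part of the
patch; the function names are hypothetical and only for illustration). It
checks whether a second thread sees its own errno location or the same
global one. With the thread-safe definition, &errno resolves to a different
address in each thread; with the plain "extern int errno" it resolves to the
same address everywhere.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, runs in the second thread. It compares the address
 * of its own errno with the address of the main thread's errno, which is
 * passed in as the argument. Returns 1 (as a pointer) if they differ.
 */
static void *compare_errno_address(void *main_errno_addr)
{
    return (void *) (intptr_t) (&errno != (int *) main_errno_addr);
}

/*
 * Returns 1 if each thread has its own errno, 0 if all threads share one
 * global errno, and -1 if the helper thread could not be created or joined.
 */
int errno_is_per_thread(void)
{
    pthread_t tid;
    void *result = NULL;

    /* &errno is evaluated here, in the calling thread. */
    if (pthread_create(&tid, NULL, compare_errno_address, &errno) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return (int) (intptr_t) result;
}
```

Calling errno_is_per_thread() from main() in a build with the thread-safe
errno definition should give 1. I would expect 0 from a build that got the
plain global declaration, which is the situation described above, but I have
not run this sketch on Solaris 8 myself.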

>
> I don't know enough about configure to know how to fix configure. It is
> properly setting -lpthread on linux.

Just linking against the right libraries does not do it here. Solaris is
not Linux.

Jan

>
>
> It's also not clear why the symptoms occur, since the build does not abort
> with an unsatisfied external. It must be picking up the pthread externals
> from somewhere else? The only difference I can see in the ldd output is the
> order of the libraries. An ldd of ecpglib shows:
>
> Good:
>
> gcc -shared -h libecpg.so.4 execute.o typename.o descriptor.o data.o error.o
> prepare.o memory.o connect.o misc.o -L../../../../src/port
> -L/mhinteg/trees/4/sun32_fixes/ported/openssl -L../pgtypeslib -lpgtypes
> -L../../../../src/interfaces/libpq -lpq -lssl -lcrypto -lm -lpthread
> -R/home/wrp/local/pgsql.7.4.2/lib -o libecpg.so.4.1
> rm -f libecpg.so.4
> ln -s libecpg.so.4.1 libecpg.so.4
> rm -f libecpg.so
> ln -s libecpg.so.4.1 libecpg.so
>
> % ldd libecpg.so
> libpgtypes.so.1 =>
> /home/wrp/local/pgsql.7.4.2/lib/libpgtypes.so.1
> libpq.so.3 => /home/wrp/local/pgsql.7.4.2/lib/libpq.so.3
> libssl.so.0.9.7 =>
> /mhinteg/trees/4/sun32_fixes/ported/openssl/libssl.so.0.9.7
> libcrypto.so.0.9.7 =>
> /mhinteg/trees/4/sun32_fixes/ported/openssl/libcrypto.so.0.9.7
> libm.so.1 => /usr/lib/libm.so.1
> libpthread.so.1 => /usr/lib/libpthread.so.1
> libresolv.so.2 => /usr/lib/libresolv.so.2
> libsocket.so.1 => /usr/lib/libsocket.so.1
> libnsl.so.1 => /usr/lib/libnsl.so.1
> libdl.so.1 => /usr/lib/libdl.so.1
> libc.so.1 => /usr/lib/libc.so.1
> libmp.so.2 => /usr/lib/libmp.so.2
> libthread.so.1 => /usr/lib/libthread.so.1
> /usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1
>
> Bad:
>
> gcc -shared -h libecpg.so.4 execute.o typename.o descriptor.o data.o error.o
> prepare.o memory.o connect.o misc.o -L../../../../src/port
> -L/mhinteg/trees/4/sun32_fixes/ported/openssl -L../pgtypeslib -lpgtypes
> -L../../../../src/interfaces/libpq -lpq -lssl -lcrypto -lm -pthread
> -R/home/wrp/local/pgsql.7.4.2/lib -o libecpg.so.4.1
> gcc: unrecognized option `-pthread'
> rm -f libecpg.so.4
> ln -s libecpg.so.4.1 libecpg.so.4
> rm -f libecpg.so
> ln -s libecpg.so.4.1 libecpg.so
>
> % !ldd
> ldd libecpg.so
> libpgtypes.so.1 =>
> /home/wrp/local/pgsql.7.4.2/lib/libpgtypes.so.1
> libpq.so.3 => /home/wrp/local/pgsql.7.4.2/lib/libpq.so.3
> libssl.so.0.9.7 =>
> /mhinteg/trees/4/sun32_fixes/ported/openssl/libssl.so.0.9.7
> libcrypto.so.0.9.7 =>
> /mhinteg/trees/4/sun32_fixes/ported/openssl/libcrypto.so.0.9.7
> libm.so.1 => /usr/lib/libm.so.1
> libresolv.so.2 => /usr/lib/libresolv.so.2
> libsocket.so.1 => /usr/lib/libsocket.so.1
> libnsl.so.1 => /usr/lib/libnsl.so.1
> libpthread.so.1 => /usr/lib/libpthread.so.1
> libdl.so.1 => /usr/lib/libdl.so.1
> libc.so.1 => /usr/lib/libc.so.1
> libmp.so.2 => /usr/lib/libmp.so.2
> libthread.so.1 => /usr/lib/libthread.so.1
> /usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1
>
>
>
> I realize it isn't entirely meaningful without the source code to know
> exactly where I put the print statements, but here is my debug output
> running the previously enclosed test program. You can see that it is
> allocating a new sqlca structure when it shouldn't be.
>
>
> Good:
>
>
> % ./testit
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x23b98
> ECPGget_sqlca: before return: address of sqlca = 0x23b98
> ECPGINIT: address of sqlca = 0x23b98
> In ECPGconnect
> ECPGconnect: address of sqlca = 0x23b98
> Before connection check
> bad connection
> ECPGconnect: address of sqlca = 0x23b98
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
> ECPGget_sqlca: before return: address of sqlca = 0x23b98
> In error.c - code = -402
> ECPGraise: address of sqlca = 0x23b98
> After ECPGraise, sqlca->sqlcode = -402
> ECPGconnect: address of sqlca = 0x23b98
> Before return false, sqlca->sqlcode = -402
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
> ECPGget_sqlca: before return: address of sqlca = 0x23b98
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
> ECPGget_sqlca: before return: address of sqlca = 0x23b98
> Connect failure: -402
>
>
>
> Bad:
>
>
> % ./testit
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x23900
> ECPGget_sqlca: before return: address of sqlca = 0x23900
> ECPGINIT: address of sqlca = 0x23900
> In ECPGconnect
> ECPGconnect: address of sqlca = 0x23900
> Before connection check
> bad connection
> ECPGconnect: address of sqlca = 0x23900
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x251b0
> ECPGget_sqlca: before return: address of sqlca = 0x251b0
> In error.c - code = -402
> ECPGraise: address of sqlca = 0x251b0
> After ECPGraise, sqlca->sqlcode = 0
> ECPGconnect: address of sqlca = 0x23900
> Before return false, sqlca->sqlcode = 0
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x25248
> ECPGget_sqlca: before return: address of sqlca = 0x25248
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x252e0
> ECPGget_sqlca: before return: address of sqlca = 0x252e0
> ECPGINIT: address of sqlca = 0x252e0
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x25378
> ECPGget_sqlca: before return: address of sqlca = 0x25378
> In error.c - code = -220
> ECPGraise: address of sqlca = 0x25378
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x25410
> ECPGget_sqlca: before return: address of sqlca = 0x25410
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x254a8
> ECPGget_sqlca: before return: address of sqlca = 0x254a8
> ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> ECPGINIT: address of sqlca = 0x25540
> ECPGget_sqlca: before return: address of sqlca = 0x25540
> SELECT error code: 0
> systemNum = -4261248
>
> I just got this in response to a post to pgsql-general on a different
> Solaris problem. This sounds like the same problem as I'm seeing. I've
> sent him my solution. Hopefully it will solve his symptoms also.
>
>>> One other problem I am looking into (and why I tried to compile with
>>> thread safety in the first place) is that this somehow did not turn on
>>> -D_REENTRANT in the CFLAGS for libpq. And that leads to libpq not using
>>> the threadsafe definition of errno, leading to serious communication
>>> trouble in the end (pqReadData() failing with ENOENT while the real
>>> error is a harmless EAGAIN from a nonblocking recv()).
>>>
>>>
>>> Jan
>
>
> Wes

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #
