Re: Solaris ecpg program doesn't work - pulling my hair

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: wespvp(at)syntegra(dot)com, Michael Meskes <meskes(at)postgresql(dot)org>, PostgreSQL <pgsql-general(at)postgresql(dot)org>
Subject: Re: Solaris ecpg program doesn't work - pulling my hair
Date: 2004-06-10 02:34:45
Message-ID: 200406100234.i5A2Yj826547@candle.pha.pa.us
Lists: pgsql-general


Jan, is this fixed in current CVS and 7.4.X CVS?

---------------------------------------------------------------------------

Jan Wieck wrote:
> wespvp(at)syntegra(dot)com wrote:
>
> >> We had this in the past. I'm not sure and would have to search the
> >> archives, but I vaguely remember that this has been a threading bug in
> >> the Solaris version. Could you please try using 7.4.2 or CVS HEAD, where
> >> this should be fixed? Alternatively, you could try with threading
> >> disabled.
> >
> > I verified last night that this problem also occurs with 7.4.2. I did some
> > more extensive testing on the solution in my previous follow-up email. That
> > is definitely the problem - configure is setting "-pthread" instead of
> > "-lpthread" in config.status. After manually correcting this in
> > config.status, everything works properly.
>
> As stated before, this is not true. If you don't compile with
> -D_REENTRANT, /usr/include/errno.h declares errno as
>
> extern int errno;
>
> instead of the thread safe
>
> extern int *___errno();
> #define errno *(___errno())
>
> At least it does so here on Solaris 8. That leads to libpq using the
> global errno variable, which might or might not be the one where "your"
> error is in a multithreaded program. I mailed the correct solution as a
> follow up to the other thread earlier today as a patch against 7.4.2.
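
To make the errno point concrete, here is a minimal sketch (not the patch
mentioned above) that checks whether a build gives each thread its own
errno. The function names are made up for illustration. It compares the
location behind errno in a second thread with the caller's: with the
"#define errno *(___errno())" form the two should differ, while with a
plain "extern int errno" both threads would see the same object.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Worker: compare this thread's errno location with the caller's.
 * A non-NULL result means the two threads use different locations.
 */
static void *compare_errno_address(void *caller_errno)
{
    return (&errno != (int *) caller_errno) ? caller_errno : NULL;
}

/*
 * Hypothetical helper: returns 1 if errno is per-thread in this build,
 * 0 if all threads share one errno, and -1 if the worker thread could
 * not be started or joined.
 */
int errno_is_per_thread(void)
{
    pthread_t worker;
    void *result = NULL;

    if (pthread_create(&worker, NULL, compare_errno_address, &errno) != 0)
        return -1;
    if (pthread_join(worker, &result) != 0)
        return -1;
    return result != NULL;
}
```

Going by the description above, a Solaris 8 build without -D_REENTRANT
would be expected to return 0 here and a build with it 1. That is an
expectation drawn from the header excerpt, not a tested result.
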
>
> >
> > I don't know enough about configure to know how to fix it. It is
> > properly setting -lpthread on Linux.
>
> Just linking against the right libraries does not do it here. Solaris is
> not Linux.
>
>
> Jan
>
> >
> >
> > It's also not clear why the symptoms occur, since the build does not abort
> > with an unsatisfied external. It must be picking up the pthread externals
> > from somewhere else? The only difference I can see in the ldd output is the
> > order of the libraries. An ldd of ecpglib shows:
> >
> > Good:
> >
> > gcc -shared -h libecpg.so.4 execute.o typename.o descriptor.o data.o error.o
> > prepare.o memory.o connect.o misc.o -L../../../../src/port
> > -L/mhinteg/trees/4/sun32_fixes/ported/openssl -L../pgtypeslib -lpgtypes
> > -L../../../../src/interfaces/libpq -lpq -lssl -lcrypto -lm -lpthread
> > -R/home/wrp/local/pgsql.7.4.2/lib -o libecpg.so.4.1
> > rm -f libecpg.so.4
> > ln -s libecpg.so.4.1 libecpg.so.4
> > rm -f libecpg.so
> > ln -s libecpg.so.4.1 libecpg.so
> >
> > % ldd libecpg.so
> > libpgtypes.so.1 => /home/wrp/local/pgsql.7.4.2/lib/libpgtypes.so.1
> > libpq.so.3 => /home/wrp/local/pgsql.7.4.2/lib/libpq.so.3
> > libssl.so.0.9.7 => /mhinteg/trees/4/sun32_fixes/ported/openssl/libssl.so.0.9.7
> > libcrypto.so.0.9.7 => /mhinteg/trees/4/sun32_fixes/ported/openssl/libcrypto.so.0.9.7
> > libm.so.1 => /usr/lib/libm.so.1
> > libpthread.so.1 => /usr/lib/libpthread.so.1
> > libresolv.so.2 => /usr/lib/libresolv.so.2
> > libsocket.so.1 => /usr/lib/libsocket.so.1
> > libnsl.so.1 => /usr/lib/libnsl.so.1
> > libdl.so.1 => /usr/lib/libdl.so.1
> > libc.so.1 => /usr/lib/libc.so.1
> > libmp.so.2 => /usr/lib/libmp.so.2
> > libthread.so.1 => /usr/lib/libthread.so.1
> > /usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1
> >
> > Bad:
> >
> > gcc -shared -h libecpg.so.4 execute.o typename.o descriptor.o data.o error.o
> > prepare.o memory.o connect.o misc.o -L../../../../src/port
> > -L/mhinteg/trees/4/sun32_fixes/ported/openssl -L../pgtypeslib -lpgtypes
> > -L../../../../src/interfaces/libpq -lpq -lssl -lcrypto -lm -pthread
> > -R/home/wrp/local/pgsql.7.4.2/lib -o libecpg.so.4.1
> > gcc: unrecognized option `-pthread'
> > rm -f libecpg.so.4
> > ln -s libecpg.so.4.1 libecpg.so.4
> > rm -f libecpg.so
> > ln -s libecpg.so.4.1 libecpg.so
> >
> > % !ldd
> > ldd libecpg.so
> > libpgtypes.so.1 => /home/wrp/local/pgsql.7.4.2/lib/libpgtypes.so.1
> > libpq.so.3 => /home/wrp/local/pgsql.7.4.2/lib/libpq.so.3
> > libssl.so.0.9.7 => /mhinteg/trees/4/sun32_fixes/ported/openssl/libssl.so.0.9.7
> > libcrypto.so.0.9.7 => /mhinteg/trees/4/sun32_fixes/ported/openssl/libcrypto.so.0.9.7
> > libm.so.1 => /usr/lib/libm.so.1
> > libresolv.so.2 => /usr/lib/libresolv.so.2
> > libsocket.so.1 => /usr/lib/libsocket.so.1
> > libnsl.so.1 => /usr/lib/libnsl.so.1
> > libpthread.so.1 => /usr/lib/libpthread.so.1
> > libdl.so.1 => /usr/lib/libdl.so.1
> > libc.so.1 => /usr/lib/libc.so.1
> > libmp.so.2 => /usr/lib/libmp.so.2
> > libthread.so.1 => /usr/lib/libthread.so.1
> > /usr/platform/SUNW,Ultra-Enterprise/lib/libc_psr.so.1
> >
> >
> >
> > I realize it isn't entirely meaningful without the source code to know
> > exactly where I put the print statements, but here is my debug output
> > running the previously enclosed test program. You can see that it is
> > allocating a new sqlca structure when it shouldn't be.
> >
> >
> > Good:
> >
> >
> > % ./testit
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x23b98
> > ECPGget_sqlca: before return: address of sqlca = 0x23b98
> > ECPGINIT: address of sqlca = 0x23b98
> > In ECPGconnect
> > ECPGconnect: address of sqlca = 0x23b98
> > Before connection check
> > bad connection
> > ECPGconnect: address of sqlca = 0x23b98
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
> > ECPGget_sqlca: before return: address of sqlca = 0x23b98
> > In error.c - code = -402
> > ECPGraise: address of sqlca = 0x23b98
> > After ECPGraise, sqlca->sqlcode = -402
> > ECPGconnect: address of sqlca = 0x23b98
> > Before return false, sqlca->sqlcode = -402
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
> > ECPGget_sqlca: before return: address of sqlca = 0x23b98
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x23b98
> > ECPGget_sqlca: before return: address of sqlca = 0x23b98
> > Connect failure: -402
> >
> >
> >
> > Bad:
> >
> >
> > % ./testit
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x23900
> > ECPGget_sqlca: before return: address of sqlca = 0x23900
> > ECPGINIT: address of sqlca = 0x23900
> > In ECPGconnect
> > ECPGconnect: address of sqlca = 0x23900
> > Before connection check
> > bad connection
> > ECPGconnect: address of sqlca = 0x23900
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x251b0
> > ECPGget_sqlca: before return: address of sqlca = 0x251b0
> > In error.c - code = -402
> > ECPGraise: address of sqlca = 0x251b0
> > After ECPGraise, sqlca->sqlcode = 0
> > ECPGconnect: address of sqlca = 0x23900
> > Before return false, sqlca->sqlcode = 0
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x25248
> > ECPGget_sqlca: before return: address of sqlca = 0x25248
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x252e0
> > ECPGget_sqlca: before return: address of sqlca = 0x252e0
> > ECPGINIT: address of sqlca = 0x252e0
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x25378
> > ECPGget_sqlca: before return: address of sqlca = 0x25378
> > In error.c - code = -220
> > ECPGraise: address of sqlca = 0x25378
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x25410
> > ECPGget_sqlca: before return: address of sqlca = 0x25410
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x254a8
> > ECPGget_sqlca: before return: address of sqlca = 0x254a8
> > ECPGget_sqlca: after pthread_getspecific: address of sqlca = 0x0
> > ECPGINIT: address of sqlca = 0x25540
> > ECPGget_sqlca: before return: address of sqlca = 0x25540
> > SELECT error code: 0
> > systemNum = -4261248
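
The two traces above follow a get-or-allocate lookup on thread-specific
data: ask pthread_getspecific() for the calling thread's sqlca and
allocate one only if nothing is stored yet. A minimal sketch of that
pattern follows. It is not ecpglib's source; struct demo_sqlca, demo_key
and demo_get_sqlca are hypothetical names used only for illustration.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real sqlca structure. */
struct demo_sqlca
{
    long sqlcode;
};

static pthread_key_t demo_key;
static pthread_once_t demo_key_once = PTHREAD_ONCE_INIT;
static int demo_key_ok = 0;

/* Create the key once; free() is the destructor run when a thread terminates. */
static void demo_key_init(void)
{
    demo_key_ok = (pthread_key_create(&demo_key, free) == 0);
}

/*
 * Return the calling thread's block, allocating it on first use only.
 * Returns NULL if the key, the allocation, or the registration fails.
 */
struct demo_sqlca *demo_get_sqlca(void)
{
    struct demo_sqlca *sqlca;

    pthread_once(&demo_key_once, demo_key_init);
    if (!demo_key_ok)
        return NULL;

    sqlca = pthread_getspecific(demo_key);
    if (sqlca == NULL)
    {
        /* First call in this thread: allocate and remember the block. */
        sqlca = calloc(1, sizeof(*sqlca));
        if (sqlca == NULL)
            return NULL;
        if (pthread_setspecific(demo_key, sqlca) != 0)
        {
            free(sqlca);
            return NULL;
        }
    }
    return sqlca;
}
```

In the "Good" trace the second lookup returns the address stored by the
first (0x23b98). In the "Bad" trace every lookup comes back 0x0, so each
call allocates a fresh block, and the -402 written by ECPGraise lands in a
structure (0x251b0) that ECPGconnect, still holding 0x23900, never reads.
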
> >
> > I just got this in response to a post to pgsql-general on a different
> > Solaris problem. This sounds like the same problem as I'm seeing. I've
> > sent him my solution. Hopefully it will solve his symptoms also.
> >
> >>> One other problem I am looking into (and why I tried to compile with
> >>> thread safety in the first place) is that this somehow did not turn on
> >>> -D_REENTRANT in the CFLAGS for libpq. And that leads to libpq not using
> >>> the threadsafe definition of errno, leading to serious communication
> >>> trouble in the end (pqReadData() failing with ENOENT while the real
> >>> error is a harmless EAGAIN from a nonblocking recv()).
> >>>
> >>>
> >>> Jan
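
For readers unfamiliar with the EAGAIN case described above, here is a
minimal sketch of how a nonblocking recv() result is usually classified.
It is not libpq's pqReadData(); the enum and both function names are
hypothetical. The point is that the decision depends entirely on reading
the right errno, so a build that consults the wrong errno location can
turn a harmless "no data yet" into a hard failure.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>

enum read_status { READ_DATA, READ_EOF, READ_RETRY, READ_FAILED };

/* Hypothetical helper: one recv() attempt, classified by result and errno. */
enum read_status demo_read_once(int fd, char *buf, size_t len, ssize_t *nread)
{
    ssize_t n = recv(fd, buf, len, 0);
    int saved_errno = errno;    /* capture before any other call can change it */

    *nread = n;
    if (n > 0)
        return READ_DATA;
    if (n == 0)
        return READ_EOF;        /* peer closed the connection */
    if (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK ||
        saved_errno == EINTR)
        return READ_RETRY;      /* nothing to read yet, not an error */
    return READ_FAILED;
}

/* Connected socket pair with fds[0] set nonblocking; returns 0 on success. */
int demo_nonblocking_pair(int fds[2])
{
    int flags;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;
    flags = fcntl(fds[0], F_GETFL, 0);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    return 0;
}
```

Reading from fds[0] before anything has been sent on fds[1] should give
READ_RETRY. If errno instead appeared to hold something unrelated such as
ENOENT, the same call would fall through to READ_FAILED, which matches
the symptom Jan describes.
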
> >
> >
> > Wes
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 8: explain analyze is your friend
>
>
> --
> #======================================================================#
> # It's easier to get forgiveness for being wrong than for being right. #
> # Let's break this rule - forgive me. #
> #================================================== JanWieck(at)Yahoo(dot)com #
>

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
