RTLD_LAZY considered harmful (Re: pltlc and pltlcu problems)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Brent Verner <brent(at)rcfile(dot)org>
Cc: Murray Prior Hobbs <murray(at)efone(dot)com>, Lamar Owen <lamar(dot)owen(at)wgcr(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: RTLD_LAZY considered harmful (Re: pltlc and pltlcu problems)
Date: 2002-01-20 18:40:17
Message-ID: 4640.1011552017@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-sql

Brent Verner <brent(at)rcfile(dot)org> writes:
> Can someone verify that pltcl works on
> their stock redhat 7.2 system?

Indeed it does not. On a straight-from-the-CD RH 7.2 install and
CVS-tip Postgres, I see both of the behaviors Murray complained of.

What I think is particularly nasty is that we get an exit(127) when
the symbol resolution fails, leading to database restart. This will
probably happen on *most* systems, not only Linux, because we are
specifying RTLD_LAZY in our dlopen() calls, meaning that missing
symbols are not flagged until they are first referenced at runtime --- and
if we call a function that should be there and isn't, there's not much
the dynamic loader can do except throw a signal or exit().

What we should be doing is specifying RTLD_NOW to dlopen(), so that
any unresolved symbol failure occurs during dlopen(), when we are
prepared to deal with it in a clean fashion.
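
A minimal sketch of the idea, not the actual backend code: the helper
name load_library_now() and its error-buffer convention are hypothetical,
chosen only for illustration. With RTLD_NOW, an unresolved symbol makes
dlopen() itself return NULL, so the caller can report dlerror() instead
of crashing later at the first call.

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Hypothetical helper (not PostgreSQL code): load a shared library with
 * immediate binding.  Returns the handle on success.  On failure returns
 * NULL and copies a description of the problem into errbuf.
 */
static void *
load_library_now(const char *path, char *errbuf, size_t errlen)
{
	void	   *handle;

	/* RTLD_NOW: resolve all undefined symbols before dlopen() returns */
	handle = dlopen(path, RTLD_NOW);
	if (handle == NULL)
	{
		/* dlerror() describes the most recent dl* failure, if any */
		const char *msg = dlerror();

		snprintf(errbuf, errlen, "could not load \"%s\": %s",
				 path, msg ? msg : "unknown error");
	}
	return handle;
}
```

The caller checks for NULL and raises an ordinary error from errbuf. On
some systems the program may need to be linked with -ldl.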

I ran into this same behavior years ago on HP-UX and fixed it by using
what they call BIND_IMMEDIATE mode; but I now see that most of the
other ports are specifying RTLD_LAZY, and thus have this problem.

Unless I hear a credible counter-argument, I am going to change
RTLD_LAZY to RTLD_NOW in src/backend/port/dynloader/linux.h. I have
tested that and it produces a clean error with no backend crash.

What I would *like* to do is make the same change in all the
port/dynloader files that reference RTLD_LAZY:
	src/backend/port/dynloader/aix.h
	src/backend/port/dynloader/bsdi.h
	src/backend/port/dynloader/dgux.h
	src/backend/port/dynloader/freebsd.h
	src/backend/port/dynloader/irix5.h
	src/backend/port/dynloader/linux.h
	src/backend/port/dynloader/netbsd.h
	src/backend/port/dynloader/openbsd.h
	src/backend/port/dynloader/osf.h
	src/backend/port/dynloader/sco.h
	src/backend/port/dynloader/solaris.h
	src/backend/port/dynloader/svr4.h
	src/backend/port/dynloader/univel.h
	src/backend/port/dynloader/unixware.h
	src/backend/port/dynloader/win.h
However, I'm a bit scared to do that at this late stage of the release
cycle, because perhaps some of these platforms don't support the full
dlopen() API. Comments? Can anyone test whether RTLD_NOW works on
any of the above-mentioned ports?

regards, tom lane
