Re: BUG #2246: Bad malloc interactions: ecpg, openssl

From: Andrew Klosterman <andrew5(at)ece(dot)cmu(dot)edu>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #2246: Bad malloc interactions: ecpg, openssl
Date: 2006-02-14 21:35:28
Message-ID: Pine.LNX.4.53L-ECE.CMU.EDU.0602141627320.29413@blossom.pdl.cmu.edu
Lists: pgsql-bugs, pgsql-patches

On Tue, 14 Feb 2006, Andrew Klosterman wrote:

> On Mon, 13 Feb 2006, Stephen Frost wrote:
>
> > Hmm, alright, well, this is at least not the fault of the patch of mine
> > which was included in Debian's 8.1.2-2 Postgres release. :)  You might
> > try compiling some debs with debugging enabled.  This is (reasonably)
> > straightforward:
> >
> > (as root:)
> > aptitude install build-essential debhelper cdbs bison perl libperl-dev \
> > 	tk8.4-dev flex libreadline5-dev libssl-dev zlib1g-dev \
> > 	libpam0g-dev libxml2-dev libkrb5-dev libxslt1-dev python-dev \
> > 	gettext bzip2 fakeroot
> > (as user:)
> > apt-get source postgresql-8.1
> > cd postgresql-8.1-8.1.0
> > export DEB_BUILD_OPTIONS="nostrip"
> > dpkg-buildpackage -uc -us -rfakeroot
> >
> > Should produce .debs in the parent directory which have debugging
> > information.  Another useful build option is "noopt", i.e.:
> > export DEB_BUILD_OPTIONS="nostrip noopt", though that could make the
> > error disappear.  It'd be terribly nice if you could do this and
> > provide a gdb backtrace with debugging... :)
> >
> > 	Thanks,
> >
> > 		Stephen
>
> Alright, I have built a system with the symbols left in the binaries.
>
> It still crashes with the "corrupted double-linked list" error.
>
> Running with ElectricFence the backtrace I get is:
>
>   Electric Fence 2.1 Copyright (C) 1987-1998 Bruce Perens.
>
> ElectricFence Aborting: Allocating 0 bytes, probably a bug.
>
> Program received signal SIGILL, Illegal instruction.
> [Switching to Thread 16384 (LWP 1895)]
> 0x401c4851 in kill () from /lib/libc.so.6
> (gdb) bt
> #0  0x401c4851 in kill () from /lib/libc.so.6
> #1  0x40037dd5 in EF_Abort () from /usr/lib/libefence.so.0
> #2  0x40037823 in memalign () from /usr/lib/libefence.so.0
> #3  0x400379ad in malloc () from /usr/lib/libefence.so.0
> #4  0x40037a10 in calloc () from /usr/lib/libefence.so.0
> #5  0x404a282f in krb5_set_default_tgs_ktypes () from /usr/lib/libkrb5.so.3
> #6  0x402c9b26 in pg_krb5_init (PQerrormsg=0x0) at fe-auth.c:119
> #7  0x402ca304 in pg_fe_getauthname (PQerrormsg=0xbffff29c "l\031")
>     at fe-auth.c:176
> #8  0x402cc861 in conninfo_parse (conninfo=<value optimized out>,
>     errorMessage=0x4057afe8) at fe-connect.c:2719
> #9  0x402cc983 in connectOptions1 (conn=0x4057acdc, conninfo=0x0)
>     at fe-connect.c:362
> #10 0x402cda11 in PQsetdbLogin (pghost=0x40574ffc "nc3", pgport=0x0,
>     pgoptions=0x0, pgtty=0x0, dbName=0x40576ff8 "andrew5",
>     login=0xbffffc31 "andrew5", pwd=0xbffffc3c "testbed") at fe-connect.c:568
> #11 0x40030fe7 in ECPGconnect (lineno=191, c=0, name=0xbffffc22 "andrew5@nc3",
>     user=0xbffffc31 "andrew5", passwd=0x0,
>     connection_name=0xbffff8b0 "CorrectnessCheck", autocommit=0)
>     at connect.c:452
> #12 0x08049ecb in DBConnect (arg_connection=0xbffff964 "CorrectnessCheck")
>     at client_test.pgcc:191
> #13 0x0804a14f in DoCorrectnessChecks () at client_test.pgcc:231
> #14 0x0804aa08 in main (argc=9, argv=0xbffffa74) at client_test.pgcc:526
>
> Again, it is showing a bad malloc in what appears to be some code using
> kerberos.  But there's nothing in my setup that I can think of right now
> that should induce a connection to be set up using kerberos.
>
> --Andrew J. Klosterman
> andrew5(at)ece(dot)cmu(dot)edu

With the debug binaries, I was able to step through the program to the
point where it appears to bail:  line 1166 of
postgresql-8.1.0/src/interfaces/libpq/fe-secure.c, where SSL_free() is
called.

Included below is a copy and paste of my GDB session.  I set a breakpoint
in close_SSL(PGconn *conn), the function that calls SSL_free(), and
printed the value of *conn, which will hopefully assist in any
debugging...

(gdb) break fe-secure.c:1162
No source file named fe-secure.c.
Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 1 (fe-secure.c:1162) pending.
(gdb) set args -t andrew5@nc3 -u andrew5 -p testbed -i 10
(gdb) run
Starting program: /.amd/flush/home/andrew5/projects/CVS-controlled/users/andrew5/thesis/code/database/metadata_server/test/client_test -t andrew5@nc3 -u andrew5 -p testbed -i 10
[Thread debugging using libthread_db enabled]
[New Thread 16384 (LWP 2103)]
Breakpoint 2 at 0x402d4bc0: file fe-secure.c, line 1162.
Pending breakpoint "fe-secure.c:1162" resolved
[Switching to Thread 16384 (LWP 2103)]

Breakpoint 2, close_SSL (conn=0x8059d00) at fe-secure.c:1162
1162    {
Current language:  auto; currently c
(gdb) bt
#0  close_SSL (conn=0x8059d00) at fe-secure.c:1162
#1  0x402c6938 in closePGconn (conn=0x8059d00) at fe-connect.c:1976
#2  0x402c6a55 in PQfinish (conn=0x8059d00) at fe-connect.c:2021
#3  0x400308f9 in ecpg_finish (act=0x8059ca8) at connect.c:122
#4  0x40031707 in ECPGdisconnect (lineno=134585600,
    connection_name=0xbffff8a8 "CorrectnessCheck") at connect.c:540
#5  0x0804a036 in DBDisconnect (arg_connection=0xbffff954 "CorrectnessCheck")
    at client_test.pgcc:218
#6  0x0804a58a in DoCorrectnessChecks () at client_test.pgcc:282
#7  0x0804a9f8 in main (argc=9, argv=0xbffffa64) at client_test.pgcc:528
(gdb) list
1157    /*
1158     *      Close SSL connection.
1159     */
1160    static void
1161    close_SSL(PGconn *conn)
1162    {
1163            if (conn->ssl)
1164            {
1165                    SSL_shutdown(conn->ssl);
1166                    SSL_free(conn->ssl);
(gdb) print *conn
$1 = {pghost = 0x80634c0 "nc3", pghostaddr = 0x0, pgport = 0x80634d0 "5432",
  pgunixsocket = 0x0, pgtty = 0x80634e0 "", connect_timeout = 0x0,
  pgoptions = 0x80634f0 "", dbName = 0x80634b0 "andrew5",
  pguser = 0x8063500 "andrew5", pgpass = 0x80634a0 "testbed",
  sslmode = 0x8063510 "prefer", krbsrvname = 0x8063520 "postgres",
  Pfdebug = 0x0, noticeHooks = {noticeRec = 0x40030bd0 <ECPGnoticeReceiver>,
    noticeRecArg = 0x8059ca8,
    noticeProc = 0x402c90c0 <defaultNoticeProcessor>, noticeProcArg = 0x0},
  status = CONNECTION_OK, asyncStatus = PGASYNC_IDLE,
  xactStatus = PQTRANS_IDLE, queryclass = PGQUERY_SIMPLE,
  nonblocking = 0 '\0', copy_is_binary = 0 '\0', copy_already_done = 0,
  notifyHead = 0x0, notifyTail = 0x0, sock = 3, laddr = {addr = {
      ss_family = 2, __ss_align = 92410796,
      __ss_padding = '\0' <repeats 119 times>}, salen = 16}, raddr = {addr = {
      ss_family = 2, __ss_align = 58856364,
      __ss_padding = '\0' <repeats 119 times>}, salen = 16},
  pversion = 196608, sversion = 80100, addrlist = 0x0, addr_cur = 0x0,
  addrlist_family = 0, setenv_state = SETENV_STATE_IDLE, next_eo = 0x0,
  be_pid = 28824, be_key = 583752927, md5Salt = "\000\000\000",
  cryptSalt = "\000", pstatus = 0x807c330, client_encoding = 8,
  verbosity = PQERRORS_DEFAULT, lobjfuncs = 0x0, inBuffer = 0x805a028 "C",
  inBufSize = 16384, inStart = 18, inCursor = 18, inEnd = 18,
  outBuffer = 0x805e030 "X", outBufSize = 16384, outCount = 0,
---Type <return> to continue, or q <return> to quit---
  outMsgStart = 1, outMsgEnd = 5, result = 0x0, curTuple = 0x0,
  allow_ssl_try = 1 '\001', wait_ssl_try = 0 '\0', ssl = 0x806d1d0,
  peer = 0x807e430,
  peer_dn = "/C=US/ST=Pennsylvania/L=Pittsburgh/O=CMU/PDL/OU=andrew5/CN=nc3.pdl.cmu.local/emailAddress=andrew5(at)mailinator(dot)com", '\0' <repeats 144 times>,
  peer_cn = "nc3.pdl.cmu.local", '\0' <repeats 15 times>, errorMessage = {
    data = 0x8062038 "", len = 0, maxlen = 256}, workBuffer = {
    data = 0x8062140 "COMMIT", len = 6, maxlen = 256}}
(gdb) s
1163            if (conn->ssl)
(gdb) s
1162    {
(gdb) s
1163            if (conn->ssl)
(gdb) s
1165                    SSL_shutdown(conn->ssl);
(gdb) s
1166                    SSL_free(conn->ssl);
(gdb) s
*** glibc detected *** corrupted double-linked list: 0x0807e428 ***

Program received signal SIGABRT, Aborted.
0x401bf851 in kill () from /lib/libc.so.6
(gdb)


--Andrew J. Klosterman
andrew5(at)ece(dot)cmu(dot)edu
