Re: BUG #2246: Bad malloc interactions: ecpg, openssl

From: Andrew Klosterman <andrew5(at)ece(dot)cmu(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #2246: Bad malloc interactions: ecpg, openssl
Date: 2006-02-13 21:01:32
Message-ID: Pine.LNX.4.53L-ECE.CMU.EDU.0602131538360.18395@blossom.pdl.cmu.edu
Lists: pgsql-bugs pgsql-patches

On Mon, 13 Feb 2006, Tom Lane wrote:

> Andrew Klosterman <andrew5(at)ece(dot)cmu(dot)edu> writes:
> > I threw in a pthread mutex around the code making the database connections
> > for each of my threads. The problem is still there ("corrupted
> > double-linked list").
>
> > Even tuning things down and instructing my code to only run a single
> > pthread manifests the problem over an SSL connection.
>
> Hmm. Based on that, the problem is starting to smell more like a
> garden-variety memory clobber, for instance malloc'ing a chunk smaller
> than the data that's later stuffed into it. It might be worth running
> the program under something like ElectricFence, which will catch the
> offender on-the-spot rather than only later when corruption of malloc's
> private data structures is detected.
>
> Looking back at your original message, I wonder if it could be the
> combination of ecpg and SSL that triggers it? I'd have thought that
> libpq/SSL alone would be pretty well wrung out, but ecpg is not so
> widely used.
>
> BTW, you did say this was i386 right? If it were a 64-bit architecture,
> I'd be about ready to bet money on the wrong-malloc-size-calculation
> theory.
>
> > Tracking down exactly what's tickling the problem in this case could be
> > tricky...
>
> Yeah :-(. If you aren't able to narrow it further by yourself, please
> try to put together a self-contained test case.
>
> regards, tom lane
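
To illustrate the wrong-malloc-size-calculation theory quoted above, here is a minimal sketch. It assumes a hypothetical array of pointers whose size is computed with sizeof(int); none of these names come from libpq or ecpg. On i386 both sizes are normally 4 bytes, so the mistake stays invisible, while on a typical 64-bit (LP64) platform the buffer would be half the size needed and later writes would run past its end.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helpers for illustration only; not taken from libpq or ecpg. */

/* Buggy calculation: sizes an array of n pointers as if each were an int. */
size_t buggy_bytes(size_t n)
{
    return n * sizeof(int);
}

/* Correct calculation: uses the real element type. */
size_t needed_bytes(size_t n)
{
    return n * sizeof(char *);
}

/* Correct allocation: sizeof *arr follows the element type automatically. */
char **alloc_ptr_array(size_t n)
{
    char **arr = malloc(n * sizeof *arr);
    return arr;   /* NULL on failure; the caller frees the array */
}
```

Comparing buggy_bytes(n) with needed_bytes(n) on the target machine shows whether this class of bug could even apply there.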

I ran the program under Electric Fence as you suggested, and this is what
I get in gdb:

Electric Fence 2.1 Copyright (C) 1987-1998 Bruce Perens.

ElectricFence Aborting: Allocating 0 bytes, probably a bug.

Program received signal SIGILL, Illegal instruction.
[Switching to Thread 16384 (LWP 24753)]
0x401c3851 in kill () from /lib/libc.so.6
(gdb) bt
#0 0x401c3851 in kill () from /lib/libc.so.6
#1 0x40139dd5 in EF_Abort () from /usr/lib/libefence.so.0
#2 0x40139823 in memalign () from /usr/lib/libefence.so.0
#3 0x401399ad in malloc () from /usr/lib/libefence.so.0
#4 0x40139a10 in calloc () from /usr/lib/libefence.so.0
#5 0x404a182f in krb5_set_default_tgs_ktypes () from /usr/lib/libkrb5.so.3
#6 0x402c8b3f in ?? () from /usr/lib/libpq.so.4
#7 0x402ded88 in ?? () from /usr/lib/libpq.so.4
#8 0x00000000 in ?? ()

It looks like something fishy is going on between libpq and libkrb5. I'm
especially suspicious since I'm not using Kerberos for authentication at
all.
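
One caveat on reading the trace: the abort fires on the request size alone. Standard C permits calloc() with a zero count (it may return NULL or a unique pointer), so this trap is not by itself proof of a memory clobber. The sketch below uses a hypothetical wrapper, alloc_array(), which is not part of libpq or libkrb5, to show one way a caller could avoid asking the allocator for zero bytes.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper for illustration; not part of libpq or libkrb5.
 * It never passes a zero-byte request to the allocator, so tools that
 * trap such requests stay quiet. */
void *alloc_array(size_t count, size_t elem_size)
{
    if (count == 0 || elem_size == 0)
        return NULL;    /* caller must treat NULL as "empty" in this case */
    return calloc(count, elem_size);   /* zero-filled, NULL on failure */
}
```

The Electric Fence man page also documents the EF_ALLOW_MALLOC_0 environment variable; setting it to a non-zero value should let the run continue past zero-byte requests, so that a real overrun, if there is one, can be caught further on.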

I am developing on i386 (more or less).
# uname -m
i686

--Andrew J. Klosterman
andrew5(at)ece(dot)cmu(dot)edu
