Re: pg11.5: ExecHashJoinNewBatch: glibc detected...double free or corruption (!prev)

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Merlin Moncure <mmoncure(at)gmail(dot)com>
Subject: Re: pg11.5: ExecHashJoinNewBatch: glibc detected...double free or corruption (!prev)
Date: 2019-08-26 08:57:01
Message-ID: 20190826085701.y7rfxq3nxh2hzdju@development
Lists: pgsql-hackers

On Mon, Aug 26, 2019 at 02:34:31PM +1200, Thomas Munro wrote:
>On Mon, Aug 26, 2019 at 1:44 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>> On Mon, Aug 26, 2019 at 01:09:19PM +1200, Thomas Munro wrote:
>> > On Sun, Aug 25, 2019 at 3:15 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>> > > I was reminded of this issue from last year, which also appeared to
>> > > involve BufFileClose() and a double-free:
>> > >
>> > > https://postgr.es/m/87y3hmee19.fsf@news-spur.riddles.org.uk
>> > >
>> > > That was a BufFile that was under the control of a tuplestore, so it
>> > > was similar to but different from your case. I suspect it's related.
>> >
>> > Hmm. tuplestore.c follows the same coding pattern as nodeHashjoin.c:
>> > it always nukes its pointer after calling BufFileClose(), so it
>> > shouldn't be capable of calling it twice for the same pointer, unless
>> > we have two copies of that pointer somehow.
>> >
>> > Merlin's reported a double-free apparently in ExecHashJoin(), not
>> > ExecHashJoinNewBatch() like this report. Unfortunately that tells us
>> > very little.
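
For readers following along, the "nuke the pointer after closing" pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: DemoFile, demo_open(), demo_close() and demo_close_and_clear() are hypothetical stand-ins, not the real BufFile API or the actual code in tuplestore.c or nodeHashjoin.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a BufFile; NOT PostgreSQL's real struct. */
typedef struct DemoFile
{
    char *buffer;
} DemoFile;

/* Counts how many times the underlying close really ran. */
static int demo_close_calls = 0;

/* Allocate a demo file and its buffer; returns NULL on allocation failure. */
static DemoFile *
demo_open(void)
{
    DemoFile *f = malloc(sizeof(DemoFile));

    if (f == NULL)
        return NULL;
    f->buffer = malloc(8192);
    if (f->buffer == NULL)
    {
        free(f);
        return NULL;
    }
    return f;
}

/* Frees everything. Calling this twice on the same pointer is a double free. */
static void
demo_close(DemoFile *f)
{
    assert(f != NULL);
    demo_close_calls++;
    free(f->buffer);
    free(f);
}

/*
 * Close the file held in *slot and clear the slot, so that a second call
 * through the same slot is a harmless no-op.
 */
static void
demo_close_and_clear(DemoFile **slot)
{
    if (*slot != NULL)
    {
        demo_close(*slot);
        *slot = NULL;
    }
}
```

Calling demo_close_and_clear() twice on the same slot runs the real close only once. The protection covers that one slot only: if a second copy of the pointer lives somewhere else, clearing the first copy does nothing to stop a second close through the other one, which is the "two copies of that pointer" scenario mentioned above.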
>
>Here's another one:
>
>https://www.postgresql.org/message-id/flat/20170601081104.1500.56202%40wrigleys.postgresql.org
>
>Hmm. Also on RHEL/CentOS 6, and also involving sorting, hashing,
>BufFileClose() but this time the glibc double free error is in
>repalloc().
>
>And another one (repeatedly happening):
>
>https://www.postgresql.org/message-id/flat/3976998C-8D3B-4825-9B10-69ECB70A597A%40appnexus.com
>
>Also on RHEL/CentOS 6, this time a sort in one case and a hash join
>in another case.
>
>Of course it's entirely possible that we have a bug here and I'm very
>keen to find it, but I can't help noticing the common factor here is
>that they're all running ancient RHEL 6.x releases, except Merlin who
>didn't say. Merlin?
>

It'd be interesting to know the exact glibc version for those machines.
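
One possible way to collect that, assuming the machines run glibc (the header and function below are glibc-specific and will not build against other C libraries), is a tiny program along these lines. The helper names running_glibc_version() and print_glibc_versions() are made up for this sketch.

```c
#include <gnu/libc-version.h>   /* glibc-specific: gnu_get_libc_version() */
#include <stdio.h>

/* Version of the glibc actually loaded at run time, e.g. "2.12". */
static const char *
running_glibc_version(void)
{
    return gnu_get_libc_version();
}

/*
 * Print both the version of the headers this program was compiled against
 * and the version of the library it is running with; the two can differ.
 */
static void
print_glibc_versions(void)
{
    printf("compiled against glibc %d.%d\n", __GLIBC__, __GLIBC_MINOR__);
    printf("running with glibc %s\n", running_glibc_version());
}
```

Calling print_glibc_versions() from main() on each affected host would report the upstream version number. Distribution packages often carry extra patches, so the full package version from the package manager would be worth including as well.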

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
