Re: pg11.5: ExecHashJoinNewBatch: glibc detected...double free or corruption (!prev)

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg11.5: ExecHashJoinNewBatch: glibc detected...double free or corruption (!prev)
Date: 2019-08-27 22:18:56
Message-ID: CAHyXU0yU+zWp=z-sG-_tujyF3JCgDfrPLTZ5TQKbw-HaU=_VgA@mail.gmail.com
Lists: pgsql-hackers

On Sun, Aug 25, 2019 at 9:35 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Mon, Aug 26, 2019 at 1:44 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > On Mon, Aug 26, 2019 at 01:09:19PM +1200, Thomas Munro wrote:
> > > On Sun, Aug 25, 2019 at 3:15 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > > > I was reminded of this issue from last year, which also appeared to
> > > > involve BufFileClose() and a double-free:
> > > >
> > > > https://postgr.es/m/87y3hmee19.fsf@news-spur.riddles.org.uk
> > > >
> > > > That was a BufFile that was under the control of a tuplestore, so it
> > > > was similar to but different from your case. I suspect it's related.
> > >
> > > Hmm. tuplestore.c follows the same coding pattern as nodeHashjoin.c:
> > > it always nukes its pointer after calling BufFileClose(), so it
> > > shouldn't be capable of calling it twice for the same pointer, unless
> > > we have two copies of that pointer somehow.
> > >
> > > Merlin's reported a double-free apparently in ExecHashJoin(), not
> > > ExecHashJoinNewBatch() like this report. Unfortunately that tells us
> > > very little.
>
> Here's another one:
>
> https://www.postgresql.org/message-id/flat/20170601081104.1500.56202%40wrigleys.postgresql.org
>
> Hmm. Also on RHEL/CentOS 6, and also involving sorting, hashing,
> BufFileClose() but this time the glibc double free error is in
> repalloc().
>
> And another one (repeatedly happening):
>
> https://www.postgresql.org/message-id/flat/3976998C-8D3B-4825-9B10-69ECB70A597A%40appnexus.com
>
> Also on RHEL/CentOS 6, this time a sort in one case and a hash join
> in another case.
>
> Of course it's entirely possible that we have a bug here and I'm very
> keen to find it, but I can't help noticing the common factor here is
> that they're all running ancient RHEL 6.x releases, except Merlin who
> didn't say. Merlin?

Just noticed this.
redhat-release: "Red Hat Enterprise Linux Server release 6.9 (Santiago)"

merlin
