Re: Buildfarm failures for hash indexes: buffer leaks

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Buildfarm failures for hash indexes: buffer leaks
Date: 2018-10-26 20:38:02
Message-ID: CAMkU=1wBdzS7MbiTSLjRDLJwLTKGMn9Bisw6332sAhQg6GsjmA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 23, 2018 at 10:51 AM Andres Freund <andres(at)anarazel(dot)de> wrote:

> On 2018-10-23 13:54:31 +0200, Fabien COELHO wrote:
> >
> > Hello Tom & Amit,
> >
> > > > > Both animals use gcc experimental versions, which may rather underline a
> > > > > new bug in gcc head rather than an existing issue in pg. Or not.
> > >
> > > > It is possible, but what could be the possible theory?
> > >
> > > It seems like the two feasible theories are (1) gcc bug, or (2) buffer
> > > leak that only occurs in very narrow circumstances, perhaps from a race
> > > condition. Given that the hash index code hasn't changed meaningfully
> > > in several months, I thought (1) seemed more probable.
> >
> > Yep, that is my thought as well.
>
>
> FWIW, my animal 'serinus', which runs debian's gcc-snapshot shows the same
> problem:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2018-10-22%2006%3A34%3A02
>
> So it seems much more likely to be 1).
>
>
> > The problem is that this kind of issue is not simple to wrap-up as a gcc bug
> > report, unlike other earlier instances that I forwarded to clang & gcc dev
> > teams.
> >
> > I'm in favor in waiting before trying to report it, to check whether the
> > probable underlying gcc problem is detected, reported by someone else, and
> > fixed in gcc head. If it persists, then we'll see.
>
> I suspect the easiest thing to narrow it down would be to bisect the
> problem in gcc :(
>

GCC's commit r265241 is what broke the PostgreSQL build. It also broke the
compiler itself: at that commit GCC could no longer bootstrap (build itself).
I had to configure GCC with --disable-bootstrap in order to get an r265241
compiler to test PostgreSQL with.

GCC's commit r265375 restored the compiler's ability to build itself, but the
PostgreSQL binaries it produces remain broken at that commit and afterwards.
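
For anyone wanting to repeat the search, the bisection over GCC revisions can
be sketched roughly as below. This is a minimal illustration, not the
procedure actually used: first_bad_revision and is_bad are hypothetical names,
the endpoints 265000 and 265500 are made-up known-good and known-bad
revisions, and the predicate is simulated. In practice is_bad would build GCC
at the given revision (with --disable-bootstrap where needed), build
PostgreSQL with --enable-cassert, and report whether "make check" fails.

```python
from typing import Callable


def first_bad_revision(good: int, bad: int, is_bad: Callable[[int], bool]) -> int:
    """Binary-search for the first revision at which is_bad() returns True.

    Assumes `good` is known to pass, `bad` is known to fail, and that every
    revision from the first bad one onwards also fails.
    """
    if good >= bad:
        raise ValueError("good must be an earlier revision than bad")
    # Invariant: `good` passes, `bad` fails.
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid   # failure already present at mid, look earlier
        else:
            good = mid  # mid still passes, look later
    return bad


# Simulated predicate standing in for "build GCC, build PostgreSQL, make check".
# The endpoints are hypothetical; only r265241 comes from the result above.
culprit = first_bad_revision(265000, 265500, lambda rev: rev >= 265241)
print(f"first bad revision: r{culprit}")
```

The sketch relies on a revision staying bad once it has gone bad, which
matches the report that the binaries remain broken from r265241 onwards.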

I ran this on an AWS c5d.4xlarge instance using the image
ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20180912
(ami-0f65671a86f061fcd).

Configuring PostgreSQL with ./configure --enable-cassert is necessary for
the subsequent "make check" to hit the failure.

I'm using PostgreSQL v12dev commit 5953c99697621174f, but the problem is
not sensitive to that and reproduces back to v10.0.

I haven't the slightest idea what to do with this information, and GCC's
cryptic SVN commit messages don't offer much insight to me.

Cheers,

Jeff
