Re: Getting better results from valgrind leak tracking

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Getting better results from valgrind leak tracking
Date: 2021-03-18 02:05:19
Message-ID: 20210318020519.4ok4q7sgjt536u36@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2021-03-17 21:30:48 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2021-03-17 18:07:36 -0400, Tom Lane wrote:
> >> Huh, interesting. I wonder why that makes the ident problem go away?
>
> > I spent a bit of time looking at valgrind, and it looks to be explicit
> > behaviour:
> >
> > // Our goal is to construct a set of chunks that includes every
> > // mempool chunk, and every malloc region that *doesn't* contain a
> > // mempool chunk.
>
> Ugh.
>
> > I guess that makes sense, but would definitely be nice to have had
> > documented...
>
> Indeed. So this started happening to us when we merged the aset
> header with its first allocation block.

Yea.

I have the feeling that valgrinds error detection capability in our
codebase used to be higher too. I wonder if that could be related
somehow. Or maybe it's just a figment of my imagination.

> I believe I've just tracked down the cause of that. Those errors
> are (as far as I've seen) only happening in parallel workers, and
> the reason is this gem in RestoreGUCState:
>
> /* See comment at can_skip_gucvar(). */
> for (i = 0; i < num_guc_variables; i++)
> if (!can_skip_gucvar(guc_variables[i]))
> InitializeOneGUCOption(guc_variables[i]);
>
> where InitializeOneGUCOption zeroes out that GUC variable, causing
> it to lose track of any allocations it was responsible for.

Ouch. At least it's a short lived issue rather than something permanently
leaking memory on every SIGHUP...

> At minimum, somebody didn't understand the difference between "initialize"
> and "reset". But I suspect we could just nuke this loop altogether.
> I've not yet tried to grok the comment that purports to justify it,
> but I fail to see why it'd ever be useful to drop values inherited
> from the postmaster.

I can't really make sense of it either. I think it may be trying to
restore the GUC state to what it would have been in postmaster,
disregarding all the settings that were set as part of PostgresInit()
etc?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2021-03-18 02:16:06 Re: New IndexAM API controlling index vacuum strategies
Previous Message Japin Li 2021-03-18 01:51:09 Re: EXPLAIN/EXPLAIN ANALYZE REFRESH MATERIALIZED VIEW