
Re: Why our Valgrind reports suck

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: Why our Valgrind reports suck
Date:	2025-05-09 15:29:43
Message-ID:	swvpjwmcriqektjlkfgopqyyn6p5mhul5hlpzfq3g23lea4guf@xfrsjegelsco
Lists:	pgsql-hackers

Hi,

On 2025-05-08 22:04:06 -0400, Tom Lane wrote:
> A nearby thread [1] reminded me to wonder why we seem to have
> so many false-positive leaks reported by Valgrind these days.
> For example, at exit of a backend that's executed a couple of
> trivial queries, I see
>
> ==00:00:00:25.515 260013== LEAK SUMMARY:
> ==00:00:00:25.515 260013== definitely lost: 3,038 bytes in 90 blocks
> ==00:00:00:25.515 260013== indirectly lost: 4,431 bytes in 61 blocks
> ==00:00:00:25.515 260013== possibly lost: 390,242 bytes in 852 blocks
> ==00:00:00:25.515 260013== still reachable: 579,139 bytes in 1,457 blocks
> ==00:00:00:25.515 260013== suppressed: 0 bytes in 0 blocks
>
> so about a thousand "leaked" blocks, all but a couple of which
> are false positives --- including nearly all the "definitely"
> leaked ones.
>
> Some testing and reading of the Valgrind manual [2] turned up a
> number of answers, which mostly boil down to us using very
> Valgrind-unfriendly data structures. Per [2],
>
> There are two ways a block can be reached. The first is with a
> "start-pointer", i.e. a pointer to the start of the block. The
> second is with an "interior-pointer", i.e. a pointer to the middle
> of the block.
>
> [ A block is reported as "possibly lost" if ] a chain of one or
> more pointers to the block has been found, but at least one of the
> pointers is an interior-pointer.
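
To illustrate the start-pointer versus interior-pointer distinction quoted above, here is a minimal sketch of an allocator that puts a header in front of each allocation. Every name in it (DemoChunkHeader, demo_alloc, demo_header, demo_free) is hypothetical and is not PostgreSQL code. The caller only ever holds the payload address, which points into the middle of the malloc'd block, so a checker that knows only about malloc sees an interior-pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping header stored in front of every payload. */
typedef struct DemoChunkHeader
{
    size_t size;                /* payload size in bytes */
} DemoChunkHeader;

/*
 * Allocate header + payload as one malloc block and return the payload.
 * The start-pointer (the header address) is not kept anywhere; the only
 * reference left is the returned interior-pointer.
 * Note: the payload is only aligned for size_t here, which is enough for
 * this byte-oriented sketch.
 */
static void *
demo_alloc(size_t size)
{
    DemoChunkHeader *hdr = malloc(sizeof(DemoChunkHeader) + size);

    if (hdr == NULL)
        return NULL;
    hdr->size = size;
    return (char *) hdr + sizeof(DemoChunkHeader);
}

/* Recover the block's start-pointer from a payload pointer. */
static DemoChunkHeader *
demo_header(void *payload)
{
    assert(payload != NULL);
    return (DemoChunkHeader *) ((char *) payload - sizeof(DemoChunkHeader));
}

/* Free the whole block, given only the payload pointer. */
static void
demo_free(void *payload)
{
    if (payload != NULL)
        free(demo_header(payload));
}
```

If a block from demo_alloc is still referenced only through its payload pointer at exit, the rule quoted above says the block is reached via an interior-pointer, so it would fall into the "possibly lost" category rather than "still reachable".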

Huh. We use the memory pool client requests to inform Valgrind about memory
contexts. I seem to recall that this "hid" many leak warnings from Valgrind. I
wonder if we somehow broke (or weakened) that.
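
For readers unfamiliar with the memory pool client requests, here is a minimal sketch of how a custom allocator can describe itself to Valgrind. The VALGRIND_* macros are the real ones from valgrind/memcheck.h; the DemoArena type and the demo_arena_* functions are hypothetical and are not how PostgreSQL's memory contexts are implemented. No-op fallbacks are defined so the sketch also compiles where the Valgrind headers are not installed. Whether and how this changes the leak report is exactly the open question above, and the sketch does not settle it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#if defined(__has_include)
#if __has_include(<valgrind/memcheck.h>)
#include <valgrind/memcheck.h>
#define DEMO_HAVE_VALGRIND 1
#endif
#endif

#ifndef DEMO_HAVE_VALGRIND
/* No-op stand-ins so the sketch builds without the Valgrind headers. */
#define VALGRIND_CREATE_MEMPOOL(pool, rzB, is_zeroed) ((void) 0)
#define VALGRIND_DESTROY_MEMPOOL(pool) ((void) 0)
#define VALGRIND_MEMPOOL_ALLOC(pool, addr, size) ((void) 0)
#define VALGRIND_MAKE_MEM_NOACCESS(addr, size) ((void) 0)
#endif

/* Hypothetical bump allocator: one big malloc block carved into chunks. */
typedef struct DemoArena
{
    char   *base;               /* start of the backing block */
    size_t  capacity;           /* size of the backing block */
    size_t  used;               /* bytes handed out so far */
} DemoArena;

static DemoArena *
demo_arena_create(size_t capacity)
{
    DemoArena *arena = malloc(sizeof(DemoArena));

    if (arena == NULL)
        return NULL;
    arena->base = malloc(capacity);
    if (arena->base == NULL)
    {
        free(arena);
        return NULL;
    }
    arena->capacity = capacity;
    arena->used = 0;

    /* Register the arena as a pool (no redzones, contents not zeroed). */
    VALGRIND_CREATE_MEMPOOL(arena, 0, 0);
    /* Space not yet handed out should not be touched by anyone. */
    VALGRIND_MAKE_MEM_NOACCESS(arena->base, capacity);
    return arena;
}

static void *
demo_arena_alloc(DemoArena *arena, size_t size)
{
    size_t  aligned = (size + 7) & ~(size_t) 7;    /* round up to 8 bytes */
    void   *chunk;

    assert(arena != NULL);
    if (aligned < size || aligned > arena->capacity - arena->used)
        return NULL;            /* overflow or arena exhausted */
    chunk = arena->base + arena->used;
    arena->used += aligned;

    /* Tell Valgrind that this range is now a chunk allocated from the pool. */
    VALGRIND_MEMPOOL_ALLOC(arena, chunk, size);
    return chunk;
}

static void
demo_arena_destroy(DemoArena *arena)
{
    assert(arena != NULL);
    /* Deregister the pool and its chunks before releasing the memory. */
    VALGRIND_DESTROY_MEMPOOL(arena);
    free(arena->base);
    free(arena);
}
```

Using the arena address itself as the pool handle is a convenience chosen for this sketch; any value that uniquely identifies the pool would serve as the anchor.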

We currently don't reset TopMemoryContext at exit, which obviously increases
the number of reported leaks massively. But OTOH, if we did reset it, there
wouldn't be a whole lot of value left in the leak check...

Greetings,

Andres Freund

