Re: Why our Valgrind reports suck

From: Yasir <yasir(dot)hussain(dot)shah(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Why our Valgrind reports suck
Date: 2025-05-13 05:28:51
Message-ID: CAA9OW9eh0+12PekdV8pNtdYFSOMyAJgVU7fop=oWFmf6DQZ-0w@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 12, 2025 at 12:11 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I wrote:
> > And, since there's nothing new under the sun around here,
> > we already had a discussion about that back in 2021:
> > https://www.postgresql.org/message-id/flat/3471359.1615937770%40sss.pgh.pa.us
> > That thread seems to have led to fixing some specific bugs,
> > but we never committed any of the discussed valgrind infrastructure
> > improvements. I'll have a go at resurrecting that...
>
> Okay, here is a patch series that updates the
> 0001-Make-memory-contexts-themselves-more-visible-to-valg.patch
> patch you posted in that thread, and makes various follow-up
> fixes that either fix or paper over various leaks. Some of it
> is committable I think, but other parts are just WIP. Anyway,
> as of the 0010 patch we can run through the core regression tests
> and see no more than a couple of kilobytes total reported leakage
> in any process, except for two tests that expose leaks in TS
> dictionary building. (That's fixable but I ran out of time,
> and I wanted to get this posted before Montreal.) There is
> work left to do before we can remove the suppressions added in
> 0002, but this is already huge progress compared to where we were.
>
> A couple of these patches are bug fixes that need to be applied and
> even back-patched. In particular, I had not realized that autovacuum
> leaks a nontrivial amount of memory per relation processed (cf 0009),
> and apparently has done for a few releases now. This is horrid in
> databases with many tables, and I'm surprised that we've not gotten
> complaints about it.
>
> regards, tom lane
>
>
Thanks for sharing the patch series. I've applied the patches on my end and
rerun the tests. Valgrind now reports only 8 bytes as definitely lost, and the
previously noisy output is almost entirely gone.
Here's the Valgrind output:

==00:00:01:50.385 90463== LEAK SUMMARY:
==00:00:01:50.385 90463== definitely lost: 8 bytes in 1 blocks
==00:00:01:50.385 90463== indirectly lost: 0 bytes in 0 blocks
==00:00:01:50.385 90463== possibly lost: 0 bytes in 0 blocks
==00:00:01:50.385 90463== still reachable: 1,182,132 bytes in 2,989 blocks
==00:00:01:50.385 90463== suppressed: 0 bytes in 0 blocks
==00:00:01:50.385 90463== Rerun with --leak-check=full to see details of leaked memory
==00:00:01:50.385 90463==
==00:00:01:50.385 90463== For lists of detected and suppressed errors, rerun with: -s
==00:00:01:50.385 90463== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 34 from 3)

Regards,

Yasir Hussain
Data Bene
