Re: Valgrind Memcheck support

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Valgrind Memcheck support
Date: 2013-06-09 21:58:49
Message-ID: 20130609215849.GA5554@awork2.anarazel.de
Lists: pgsql-hackers

On 2013-06-09 17:25:59 -0400, Noah Misch wrote:
> Valgrind's Memcheck tool[1] is handy for finding bugs, but our use of a custom
> allocator limits its ability to detect problems in unmodified PostgreSQL.
> During the 9.1 beta cycle, I found some bugs[2] with a rough patch adding
> instrumentation to aset.c and mcxt.c such that Memcheck understood our
> allocator. I've passed that patch around to a few people over time, and I've
> now removed the roughness such that it's ready for upstream. In hopes of
> making things clearer in the commit history, I've split out a preliminary
> refactoring patch from the main patch and attached each separately.
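
For readers unfamiliar with why a custom allocator hides bugs from Memcheck:
the tool tracks malloc()/free(), so memory carved out of one large block looks
valid everywhere. A rough sketch of the idea, using a toy arena rather than
anything from aset.c, might look like the following. The Arena type, the
arena_* functions and the USE_VALGRIND switch are assumptions made for this
illustration; only the two VALGRIND_MAKE_MEM_* client requests come from
Valgrind's memcheck.h.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumption: USE_VALGRIND is a build switch invented for this sketch.
 * With it, the real client requests from <valgrind/memcheck.h> are used;
 * without it, they compile to no-ops so the code builds anywhere.
 */
#ifdef USE_VALGRIND
#include <valgrind/memcheck.h>
#else
#define VALGRIND_MAKE_MEM_NOACCESS(addr, size)  ((void) (addr), (void) (size))
#define VALGRIND_MAKE_MEM_UNDEFINED(addr, size) ((void) (addr), (void) (size))
#endif

#define ARENA_SIZE 1024

/* Hypothetical toy arena: one fixed buffer handed out front to back. */
typedef struct Arena
{
    size_t        used;             /* bytes handed out so far */
    unsigned char buf[ARENA_SIZE];  /* backing storage */
} Arena;

static void
arena_init(Arena *a)
{
    a->used = 0;
    /* Nothing is allocated yet: any access to the buffer is a bug. */
    VALGRIND_MAKE_MEM_NOACCESS(a->buf, ARENA_SIZE);
}

static void *
arena_alloc(Arena *a, size_t size)
{
    /* Round the request up to a multiple of 8 bytes. */
    size_t aligned = (size + 7) & ~(size_t) 7;
    void  *p;

    if (aligned < size || aligned > ARENA_SIZE - a->used)
        return NULL;            /* overflow or out of space */

    p = a->buf + a->used;
    a->used += aligned;

    /*
     * Only the requested bytes become addressable, and they start out
     * undefined.  The rounding slack stays NOACCESS, so overruns into it
     * are reported.
     */
    VALGRIND_MAKE_MEM_UNDEFINED(p, size);
    return p;
}

static void
arena_reset(Arena *a)
{
    a->used = 0;
    /* Everything is "freed": later use of old pointers is reported. */
    VALGRIND_MAKE_MEM_NOACCESS(a->buf, ARENA_SIZE);
}
```

Without Valgrind the macros do nothing and the arena behaves as before; under
Memcheck, reads of freshly allocated bytes and accesses after arena_reset()
become visible.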

> Besides the aset.c/mcxt.c instrumentation, this patch adds explicit checks for
> undefined memory to PageAddItem() and printtup(); this has caught C-language
> functions that fabricate a Datum without initializing all bits. The inclusion
> of all this is controlled by a pg_config_manual.h setting. The patch also
> adds a "suppression file" that directs Valgrind to silence nominal errors we
> don't plan to fix.
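
The explicit checks matter because Memcheck does not complain when undefined
bytes are merely copied around; it reports them only once they influence
control flow, an address, or a system call. A minimal sketch of such a check
at an output boundary follows. emit_value() and the USE_VALGRIND switch are
hypothetical names for this illustration and are not taken from the patch;
VALGRIND_CHECK_MEM_IS_DEFINED is the real client request from memcheck.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Assumption: USE_VALGRIND is a build switch invented for this sketch.
 * Without it the check compiles to a no-op.
 */
#ifdef USE_VALGRIND
#include <valgrind/memcheck.h>
#else
#define VALGRIND_CHECK_MEM_IS_DEFINED(addr, size) ((void) (addr), (void) (size))
#endif

/*
 * Hypothetical output routine: copy a value into an outgoing buffer.
 * Returns the number of bytes written, or 0 if the value does not fit.
 */
static size_t
emit_value(unsigned char *out, size_t out_cap, const void *value, size_t len)
{
    if (len > out_cap)
        return 0;

    /*
     * Ask Memcheck to report right here if any of the bytes were never
     * written, instead of letting them flow silently into the output.
     */
    VALGRIND_CHECK_MEM_IS_DEFINED(value, len);

    memcpy(out, value, len);
    return len;
}
```

Under Memcheck, passing a struct with unset padding or a partially filled
value to emit_value() would be flagged at the call, which points at the code
that built the value rather than at some later consumer.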

Very nice work. I started doing this quite some time back to smoke
out some bugs in code of mine, but never got remotely to a point where
it was submittable. I did already find some bugs with it, though, so I'd
be very happy if this could get committed.

Will take a look.

> I strongly advise installing the latest-available Valgrind, particularly
> because recent releases suffer far less of a performance drop processing the
> instrumentation added by this patch. A "make installcheck" run takes 273
> minutes under Valgrind 3.6.0 but just 27 minutes under Valgrind 3.8.1.

At least on Linux amd64 I'd strongly suggest using something newer than
(AFAIR) 3.8.1, i.e. the svn version. Up to that version, Valgrind corrupts
the stack alignment inside signal handlers, and the misalignment doesn't
get fixed up even after a fork(). This leads to mysterious segfaults, e.g.
during elog()s, due to the use of SSE registers, which have stronger
alignment requirements.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
