Re: [PATCH] Add a few suppression rules for Valgrind

From: Andres Freund <andres(at)anarazel(dot)de>
To: Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Add a few suppression rules for Valgrind
Date: 2018-02-20 17:28:58
Message-ID: 20180220172858.ol7rndl6o7sh3qvr@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

> I decided to run the code from master branch under Valgrind and
> discovered that it reports some errors.
>
> There are multiple reports like this one (seems to be a false alarm):
>
> ```
> Invalid read of size 16
> at 0x605F488: __wcsnlen_sse4_1 (in /usr/lib/libc-2.26.so)
> by 0x604F5C2: wcsrtombs (in /usr/lib/libc-2.26.so)
> by 0x5FE4C41: wcstombs (in /usr/lib/libc-2.26.so)
> by 0x6FFAFC: wchar2char (pg_locale.c:1641)
> by 0x6937A0: str_tolower (formatting.c:1599)
> by 0x6F88E8: lower (oracle_compat.c:50)
> by 0x43097D: ExecInterpExpr (execExprInterp.c:678)
> by 0x48B201: ExecEvalExpr (executor.h:286)
> by 0x48BBD6: tfuncInitialize (nodeTableFuncscan.c:369)
> by 0x48B91E: tfuncFetchRows (nodeTableFuncscan.c:299)
> by 0x48B245: TableFuncNext (nodeTableFuncscan.c:65)
> by 0x4462E6: ExecScanFetch (execScan.c:95)
> Address 0x513adab0 is 7,808 bytes inside a recently re-allocated
> block of size 16,384 alloc'd
> at 0x4C2CEDF: malloc (vg_replace_malloc.c:299)
> by 0x7E2807: AllocSetAlloc (aset.c:934)
> by 0x7EF481: palloc (mcxt.c:848)
> by 0x42D2A7: ExprEvalPushStep (execExpr.c:2131)
> by 0x42D7DB: ExecPushExprSlots (execExpr.c:2286)
> by 0x42D723: ExecInitExprSlots (execExpr.c:2265)
> by 0x429B77: ExecBuildProjectionInfo (execExpr.c:370)
> by 0x44A39B: ExecAssignProjectionInfo (execUtils.c:457)
> by 0x4733FE: ExecInitNestLoop (nodeNestloop.c:308)
> by 0x4442F0: ExecInitNode (execProcnode.c:290)
> by 0x43C5EB: InitPlan (execMain.c:1043)
> by 0x43B31C: standard_ExecutorStart (execMain.c:262)

These are libc issues, I don't think we should carry exceptions for
those. Normally valgrind after a while will catch up and internally
ignore them. Here the issue is that the SSE implementation can read a
few trailing bytes at the end of the string, which is harmless.

Normally valgrind fixes this issue by "intercepting" such libc symbols,
but I suspect that some libc optimizations broke that.

On my systems I just include a global valgrind suppression file which
includes libc specific things and then the postgres valgrind.supp. I'm
very hesitant to add suppressions to the codebase for this, seems too
likely to actually hide bugs.

> ```
>
> Also there are many reports like this one (definitely false alarm):
>
> ```
> Syscall param epoll_pwait(sigmask) points to unaddressable byte(s)
> at 0x60A2F90: epoll_pwait (in /usr/lib/libc-2.26.so)
> by 0x5F7B24: WaitEventSetWaitBlock (latch.c:1048)
> by 0x5F79E8: WaitEventSetWait (latch.c:1000)
> by 0x493A79: secure_read (be-secure.c:170)
> by 0x4A3DE2: pq_recvbuf (pqcomm.c:963)
> by 0x4A3EAA: pq_getbyte (pqcomm.c:1006)
> by 0x627AD0: SocketBackend (postgres.c:339)
> by 0x628032: ReadCommand (postgres.c:512)
> by 0x62D243: PostgresMain (postgres.c:4086)
> by 0x581240: BackendRun (postmaster.c:4412)
> by 0x5808BE: BackendStartup (postmaster.c:4084)
> by 0x57CA20: ServerLoop (postmaster.c:1757)
> Address 0x0 is not stack'd, malloc'd or (recently) free'd

Yea, this one is just valgrind not really understanding the syscall.

It's been fixed in git's source tree:

commit 3ac87cf9277964802ddd9af9747a10ff0b838c29
Author: Mark Wielaard <mark(at)klomp(dot)org>
Date: 2017-06-17 13:49:22 +0000

epoll_pwait can have a NULL sigmask.

we just need a new valgrind release.

- Andres

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-02-20 17:32:22 Re: Option to ensure monotonic timestamps
Previous Message Konstantin Knizhnik 2018-02-20 17:04:15 Re: Contention preventing locking