Re: Refactoring backend fork+exec code

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>, Tristan Partin <tristan(at)neon(dot)tech>, Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Refactoring backend fork+exec code
Date: 2023-12-01 20:44:08
Message-ID: a1b69a02-544a-471e-8b95-81256e992248@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 01/12/2023 16:00, Alexander Lakhin wrote:
> Maybe you could look at issues with running `make check` under Valgrind
> when server built with CPPFLAGS="-DUSE_VALGRIND -DEXEC_BACKEND":
> # +++ regress check in src/test/regress +++
> # initializing database system by copying initdb template
> # postmaster failed, examine ".../src/test/regress/log/postmaster.log" for the reason
> Bail out!make[1]: ***
>
> ...
> 2023-12-01 16:48:39.136 MSK postmaster[1307988] LOG:  listening on Unix socket "/tmp/pg_regress-pPFNk0/.s.PGSQL.55312"
> ==00:00:00:01.692 1259396== Syscall param write(buf) points to uninitialised byte(s)
> ==00:00:00:01.692 1259396==    at 0x5245A37: write (write.c:26)
> ==00:00:00:01.692 1259396==    by 0x51BBF6C: _IO_file_write@@GLIBC_2.2.5 (fileops.c:1180)
> ==00:00:00:01.692 1259396==    by 0x51BC84F: new_do_write (fileops.c:448)
> ==00:00:00:01.692 1259396==    by 0x51BC84F: _IO_new_file_xsputn (fileops.c:1254)
> ==00:00:00:01.692 1259396==    by 0x51BC84F: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1196)
> ==00:00:00:01.692 1259396==    by 0x51B1056: fwrite (iofwrite.c:39)
> ==00:00:00:01.692 1259396==    by 0x552E21: internal_forkexec (postmaster.c:4518)
> ==00:00:00:01.692 1259396==    by 0x5546A1: postmaster_forkexec (postmaster.c:4444)
> ==00:00:00:01.692 1259396==    by 0x55471C: StartChildProcess (postmaster.c:5275)
> ==00:00:00:01.692 1259396==    by 0x557B61: PostmasterMain (postmaster.c:1454)
> ==00:00:00:01.692 1259396==    by 0x472136: main (main.c:198)
> ==00:00:00:01.692 1259396==  Address 0x1ffeffdc11 is on thread 1's stack
> ==00:00:00:01.692 1259396==  in frame #4, created by internal_forkexec (postmaster.c:4482)
> ==00:00:00:01.692 1259396==
>
> With memset(param, 0, sizeof(*param)); added at the beginning of
> save_backend_variables(), server starts successfully, but then
> `make check` fails with other Valgrind error.

That's actually a pre-existing issue, I'm seeing that even on unpatched
'master'.

In a nutshell, the problem is that the uninitialized padding bytes in
BackendParameters are written to the file. When we read the file back,
we don't access the padding bytes, so that's harmless. But Valgrind
doesn't know that.

On Windows, the file is created with
CreateFileMapping(INVALID_HANDLE_VALUE, ...) and we write the variables
directly to the mapping. If I understand the Windows API docs correctly,
it is guaranteed to be initialized to zeros, so we don't have this
problem on Windows, only on other platforms with EXEC_BACKEND. I think
it makes sense to clear the memory on other platforms too, since that's
what we do on Windows.

Committed a fix with memset(). I'm not sure what our policy with
backpatching this kind of issues is. This goes back to all supported
versions, but given the lack of complaints, I chose to not backpatch.

--
Heikki Linnakangas
Neon (https://neon.tech)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2023-12-01 20:49:46 Re: encoding affects ICU regex character classification
Previous Message Tom Lane 2023-12-01 19:57:44 Re: Dumped SQL failed to execute with ERROR "GROUP BY position -1 is not in select list"