Re: Refactoring backend fork+exec code

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tristan Partin <tristan(at)neon(dot)tech>, Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Refactoring backend fork+exec code
Date: 2023-12-02 03:00:01
Message-ID: 5e976369-2925-e0cc-b5a1-e9e356264596@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Heikki,

01.12.2023 23:44, Heikki Linnakangas wrote:
>
>> With memset(param, 0, sizeof(*param)); added at the beginning of
>> save_backend_variables(), server starts successfully, but then
>> `make check` fails with other Valgrind error.
>
> That's actually a pre-existing issue, I'm seeing that even on unpatched 'master'.

Thank you for fixing that!

Yes, I had discovered it before, but yesterday I decided to check whether
your patches improve the situation...

What bothered me additionally, is an error detected after server start. I
couldn't see it without the patches applied. I mean, on HEAD I now see
`make check` passing, but with the patches it fails:
...
# parallel group (20 tests):  interval date numerology polygon box macaddr8 macaddr multirangetypes line timestamp
timetz timestamptz time circle strings lseg inet md5 path point
not ok 22    + strings                                  1048 ms
# (test process exited with exit code 2)
not ok 23    + md5                                      1052 ms
# (test process exited with exit code 2)
...
src/test/regress/log/postmaster.log contains:
==00:00:00:30.730 1713480== Syscall param write(buf) points to uninitialised byte(s)
==00:00:00:30.730 1713480==    at 0x5245A37: write (write.c:26)
==00:00:00:30.730 1713480==    by 0x51BBF6C: _IO_file_write@@GLIBC_2.2.5 (fileops.c:1180)
==00:00:00:30.730 1713480==    by 0x51BC84F: new_do_write (fileops.c:448)
==00:00:00:30.730 1713480==    by 0x51BC84F: _IO_new_file_xsputn (fileops.c:1254)
==00:00:00:30.730 1713480==    by 0x51BC84F: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1196)
==00:00:00:30.730 1713480==    by 0x51B1056: fwrite (iofwrite.c:39)
==00:00:00:30.730 1713480==    by 0x5540CF: internal_forkexec (postmaster.c:4526)
==00:00:00:30.730 1713480==    by 0x5543C0: bgworker_forkexec (postmaster.c:5624)
==00:00:00:30.730 1713480==    by 0x555477: do_start_bgworker (postmaster.c:5665)
==00:00:00:30.730 1713480==    by 0x555738: maybe_start_bgworkers (postmaster.c:5928)
==00:00:00:30.730 1713480==    by 0x556072: process_pm_pmsignal (postmaster.c:5080)
==00:00:00:30.730 1713480==    by 0x556610: ServerLoop (postmaster.c:1761)
==00:00:00:30.730 1713480==    by 0x557BE2: PostmasterMain (postmaster.c:1469)
==00:00:00:30.730 1713480==    by 0x47216B: main (main.c:198)
==00:00:00:30.730 1713480==  Address 0x1ffeffd8c0 is on thread 1's stack
==00:00:00:30.730 1713480==  in frame #4, created by internal_forkexec (postmaster.c:4482)
==00:00:00:30.730 1713480==
...
2023-12-02 05:14:30.751 MSK client backend[1713740] pg_regress/rangetypes FATAL:  terminating connection due to
unexpected postmaster exit
2023-12-02 05:14:31.033 MSK client backend[1713734] pg_regress/numeric FATAL:  postmaster exited during a parallel
transaction
TRAP: failed Assert("!IsTransactionOrTransactionBlock()"), File: "pgstat.c", Line: 591, PID: 1713734

I haven't looked deeper yet, but it seems that we see two issues here (and
Assert is not directly caused by the patches set.)

Best regards,
Alexander

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Davin Shearer 2023-12-02 03:00:29 Re: Emitting JSON to file using COPY TO
Previous Message Thomas Munro 2023-12-02 02:30:09 Re: processes stuck in shutdown following OOM/recovery