Re: stress test for parallel workers

From:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: stress test for parallel workers
Date:	2019-08-06 23:57:23
Message-ID:	CA+hUKGL6cDyb2maq2P60cEsjFK=3saBCAj7sDzE3jysL-PRwqg@mail.gmail.com
Lists:	pgsql-hackers

chipmunk also:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=chipmunk&dt=2019-08-06%2014:16:16

I wondered if the build farm should try to report OOM kill -9 or other
signal activity affecting the postmaster.

On some systems (depending on sysctl kernel.dmesg_restrict on Linux,
security.bsd.unprivileged_read_msgbuf on FreeBSD etc) you can run
dmesg as a non-root user, and there the OOM killer's footprints or
signaled exit statuses for processes under init might normally be found,
but that seems a bit invasive for the host system (I guess you'd
filter it carefully). Unfortunately it isn't enabled on many common
systems anyway.
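
A minimal sketch of how a build farm client could check, on Linux,
whether an unprivileged dmesg is even possible before trying it. The
function name is hypothetical; the only thing it relies on is that the
kernel.dmesg_restrict sysctl is exposed as a file under /proc/sys.
Filtering the actual dmesg output is deliberately left out.

```python
def dmesg_readable_by_unprivileged(path="/proc/sys/kernel/dmesg_restrict"):
    """Report whether the kernel lets non-root users read the kernel log.

    Returns True if the sysctl is 0, False if it is set to anything else,
    and None if the file is missing or unreadable (for example on a
    non-Linux system), in which case nothing can be concluded.
    """
    try:
        with open(path) as f:
            return f.read().strip() == "0"
    except OSError:
        return None
```

A caller would only go on to run dmesg when this returns True, and
treat both False and None as "no information available".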

Maybe there is a systemd-specific way to get the info we need without
being root?

Another idea: start the postmaster under a subreaper (Linux 3.4+
prctl(PR_SET_CHILD_SUBREAPER), FreeBSD 10.2+
procctl(PROC_REAP_ACQUIRE)) that exists just to report on its
children's exit status, so the build farm could see a "pid XXX was
killed by signal 9" message if it is nuked by the OOM killer. Perhaps
there is a common subreaper wrapper out there that would wait, print
messages like that, rinse and repeat until it has no children and then
exit, or perhaps pg_ctl or even a perl script could do something like
that if requested. Another thought, not explored, is the brand new
Linux pidfd stuff that can be used to wait and get exit status for a
non-child process (or the older BSD equivalent), but the paint isn't
even dry on that stuff anyway.
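
To make the subreaper idea concrete, here is a rough Linux-only sketch
in Python rather than the perl or pg_ctl variants mentioned above. It
is an illustration under stated assumptions, not a proposed patch: the
function names are hypothetical, prctl() is reached through ctypes, and
36 is the value of PR_SET_CHILD_SUBREAPER from <linux/prctl.h>. The
FreeBSD procctl(PROC_REAP_ACQUIRE) path is not covered.

```python
import ctypes
import os

PR_SET_CHILD_SUBREAPER = 36  # constant from <linux/prctl.h>


def become_subreaper():
    """Ask the kernel to reparent orphaned descendants to this process."""
    libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.prctl(PR_SET_CHILD_SUBREAPER, ctypes.c_ulong(1),
                    ctypes.c_ulong(0), ctypes.c_ulong(0), ctypes.c_ulong(0))
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def describe_exit(pid, status):
    """Turn a wait() status into a one-line report."""
    if os.WIFSIGNALED(status):
        return f"pid {pid} was killed by signal {os.WTERMSIG(status)}"
    if os.WIFEXITED(status):
        return f"pid {pid} exited with status {os.WEXITSTATUS(status)}"
    return f"pid {pid} changed state (raw status {status})"


def reap_all(report=print):
    """Wait, report, and repeat until this process has no children left."""
    while True:
        try:
            pid, status = os.wait()
        except ChildProcessError:
            return  # nothing left to wait for
        report(describe_exit(pid, status))


def run_under_subreaper(argv):
    """Start argv as a child, then report every descendant's exit."""
    become_subreaper()
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    reap_all()
```

The wrapper would be invoked with the command that starts the server,
for example run_under_subreaper(sys.argv[1:]) from a small script.
Anything that daemonizes away from its original parent should then be
reparented to the wrapper instead of init, so its exit status shows up
in reap_all().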

--
Thomas Munro
https://enterprisedb.com

In response to

Re: stress test for parallel workers at 2019-07-24 05:15:14 from Tom Lane

Responses

Re: stress test for parallel workers at 2019-08-07 04:29:19 from Tom Lane
Re: stress test for parallel workers at 2019-08-07 13:30:46 from Heikki Linnakangas
