From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: stress test for parallel workers |
Date: | 2019-08-07 14:30:51 |
Message-ID: | af515ace-4956-e720-5c6e-0c3743723dcf@iki.fi |
Lists: | pgsql-hackers |
On 07/08/2019 16:57, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
>> On 07/08/2019 02:57, Thomas Munro wrote:
>>> On Wed, Jul 24, 2019 at 5:15 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> So I think I've got to take back the assertion that we've got
>>>> some lurking generic problem. This pattern looks way more
>>>> like a platform-specific issue. Overaggressive OOM killer
>>>> would fit the facts on vulpes/wobbegong, perhaps, though
>>>> it's odd that it only happens on HEAD runs.
>
>>> chipmunk also:
>>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=chipmunk&dt=2019-08-06%2014:16:16
>
>> FWIW, I looked at the logs in /var/log/* on chipmunk, and found no
>> evidence of OOM killings. I can see nothing unusual in the OS logs
>> around the time of that failure.
>
> Oh, that is very useful info, thanks. That seems to mean that we
> should be suspecting a segfault, assertion failure, etc inside
> the postmaster. I don't see any TRAP message in chipmunk's log,
> so assertion failure seems to be ruled out, but other sorts of
> process-crashing errors would fit the facts.
>
> A stack trace from the crash would be mighty useful info along
> about here. I wonder whether chipmunk has the infrastructure
> needed to create such a thing. From memory, the buildfarm requires
> gdb for that, but not sure if there are additional requirements.

It does have gdb installed.

> Also, if you're using systemd or something else that thinks it
> ought to interfere with where cores get dropped, that could be
> a problem.

I think they should just go to a file called "core"; I don't think I've
changed any settings related to it, at least. I tried
"find / -name core*", but didn't find any core files.

- Heikki
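
As an aside on the question of where cores get dropped: on Linux, the two settings that usually decide whether a file named "core" can appear at all are the RLIMIT_CORE resource limit and the kernel's /proc/sys/kernel/core_pattern. The following is only a minimal sketch for inspecting them, not something taken from this thread or from the buildfarm client. The helper names describe_core_pattern and core_limit_allows_dumps are hypothetical, and the limit it reports is that of the process running the script, which may differ from the limit the postmaster inherits under the buildfarm.

```python
import resource
from pathlib import Path


def describe_core_pattern(pattern: str) -> str:
    """Summarize where a given Linux core_pattern sends core dumps."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        # A leading pipe hands the dump to a helper program (for example
        # systemd-coredump), so no "core" file shows up in the data directory.
        parts = pattern[1:].split()
        helper = parts[0] if parts else "(unspecified)"
        return "piped to helper: " + helper
    if pattern.startswith("/"):
        return "written to absolute path template: " + pattern
    return "written relative to the crashing process's cwd as: " + pattern


def core_limit_allows_dumps(soft_limit: int) -> bool:
    """A soft RLIMIT_CORE of 0 means no core file is written to disk."""
    return soft_limit != 0


if __name__ == "__main__":
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    print("core files allowed by rlimit:", core_limit_allows_dumps(soft))

    pattern_file = Path("/proc/sys/kernel/core_pattern")  # Linux only
    if pattern_file.exists():
        print(describe_core_pattern(pattern_file.read_text()))
```

If the pattern turns out to be piped or absolute, or the soft limit is 0, a search for files named core* would be expected to come up empty even after a crash.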