Re: emergency outage requiring database restart

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Oskari Saarenmaa <os(at)ohmu(dot)fi>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: emergency outage requiring database restart
Date: 2016-11-07 22:31:18
Message-ID: CAHyXU0wOd4m3gc4RhRLsiu-6YRGTOpUCxC9spNQk_cyLkJeY3Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 2, 2016 at 10:45 AM, Oskari Saarenmaa <os(at)ohmu(dot)fi> wrote:
> 26.10.2016, 21:34, Andres Freund kirjoitti:
>>
>> Any chance that plsh or the script it executes does anything with the file
>> descriptors it inherits? That'd certainly one way to get into odd corruption
>> issues.
>>
>> We processor really should use O_CLOEXEC for the majority of it file
>> handles.
>
>
> Attached a patch to always use O_CLOEXEC in BasicOpenFile if we're not using
> EXEC_BACKEND. It'd be nice to not expose all fds to most pl-languages
> either, but I guess there's no easy solution to that without forcibly
> closing all fds whenever any functions are called.

FYI, this is not my first run-in with strange behavior, on this thread
(not necessarily worth reading);
https://www.postgresql.org/message-id/CAHyXU0x5mW-SbSuUBEshzumOaN7JPUWa7Ejza68HE-KY0Nq7Kg%40mail.gmail.com

I had a similar set of starting conditions that resulted in very
strange behavior (but not data corruption AFAICT) --the problem
mysteriously disappeared when I fixed some bugs that would cause the
routine to concurrently do the same operation. I would like to point
out that I use both pl/sh (and via it, sqsh) very highly, so these
problems are not necessarily the norm.

Regardless, it seems like you might be on to something, and I'm
inclined to patch your change, test it, and roll it out to production.
If it helps or at least narrows the problem down, we ought to give it
consideration for inclusion (unless someone else can think of a good
reason not to do that, heh!).

merlin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Nasby 2016-11-07 22:39:12 Re: Replace use malloc() & friend by memory contexts for plperl and pltcl
Previous Message Gilles Darold 2016-11-07 22:29:28 Re: Patch to implement pg_current_logfile() function