Re: could not stat promote trigger file leads to shutdown

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: peter(dot)eisentraut(at)2ndquadrant(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, masao(dot)fujii(at)gmail(dot)com, michael(at)paquier(dot)xyz, pgsql-hackers(at)postgresql(dot)org
Subject: Re: could not stat promote trigger file leads to shutdown
Date: 2019-12-05 01:28:03
Message-ID: 20191205.102803.520432354765994907.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Wed, 4 Dec 2019 11:52:33 +0100, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote in
> On 2019-11-20 16:21, Tom Lane wrote:
> >> AFAICT, a GUC check hook wouldn't actually be able to address the
> >> specific scenario I described. At the time the GUC is set, the
> >> containing the directory of the trigger file does not exist yet. This
> >> is currently not an error. The problem only happens if after the GUC
> >> is set the containing directory appears and is not readable.
> > True, if the hook just consists of trying fopen() and checking the
> > errno. Would it be feasible to insist that the containing directory
> > exist and be readable? We have enough infrastructure that that
> > should only take a few lines of code, so the question is whether
> > or not that's a nicer behavior than we have now.
>
> Is it possible to do this in a mostly bullet-proof way? Just because
> the directory exists and looks pretty good otherwise, doesn't mean we
> can read a file created in it later in a way that doesn't fall afoul
> of the existing error checks. There could be something like SELinux
> lurking, for example.
>
> Maybe some initial checking would be useful, but I think we still need
> to downgrade the error check at use time a bit to not crash in the
> cases that we miss.

+1. Any GUC variable that points to external or externally modifiable
resources, including directories, files, and commands, can face that
kind of problem. For example, a bogus value for archive_command doesn't
prevent the server from starting. I understand that the reason is that
we don't have a reliable means to check the command before we actually
execute it, but the server can (or should) continue running even if it
fails. I think this issue falls into that category.
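
To illustrate what "downgrade the error check at use time" could look
like, here is a minimal sketch in plain C. It is not the actual server
code: the function name trigger_file_present is hypothetical, and
fprintf() to stderr merely stands in for the server's own logging. The
assumption is that a missing file (ENOENT) is the normal case, and that
any other stat() failure is reported as a warning instead of stopping
the server.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (illustration only, not PostgreSQL source).
 * Returns true if the trigger file can be stat()ed, false otherwise.
 * A failure other than ENOENT is reported as a warning, and the caller
 * keeps running instead of shutting down.
 */
static bool
trigger_file_present(const char *path)
{
    struct stat st;

    /* An unset path means there is nothing to check. */
    if (path == NULL || path[0] == '\0')
        return false;

    /* stat() succeeded: the trigger file exists. */
    if (stat(path, &st) == 0)
        return true;

    /*
     * ENOENT is the expected "no trigger yet" case, so stay silent.
     * Anything else (e.g. EACCES on the containing directory) is worth
     * a warning, but is not treated as fatal.
     */
    if (errno != ENOENT)
        fprintf(stderr,
                "WARNING: could not stat promote trigger file \"%s\": %s\n",
                path, strerror(errno));

    return false;
}
```

With this shape, an unreadable containing directory produces a warning
on each check and the standby simply keeps waiting, which is the same
way a failing archive_command is tolerated.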

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
