Re: PATCH: Configurable file mode mask

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: David Steele <david(at)pgmasters(dot)net>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Adam Brightwell <adam(dot)brightwell(at)crunchydata(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: Configurable file mode mask
Date: 2018-04-06 04:55:54
Message-ID: 20180406045554.GF4031@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 05, 2018 at 12:08:15PM -0400, David Steele wrote:
> On 4/5/18 2:55 AM, Michael Paquier wrote:
>> On Wed, Apr 04, 2018 at 08:03:54PM -0400, David Steele wrote:
>>
>>> Instead I have created variables in file_perm.c
>>> that hold the current file create mode, dir create mode, and mode mask.
>>> All call sites use those variables (e.g. pg_dir_create_mode), which I
>>> think are much clear.
>>
>> Hm. I personally find that even more confusing, especially with
>> SetDataDirectoryCreatePerm which basically is the same as
>> SetConfigOption for the backend.
>
> The GUC shows the current mode of the data directory, while the
> variables in file_perm.c store the mode that should be used to create
> new dirs/files. One is certainly based on the other but I thought it
> best to split them for clarity.

Yeah, there are arguments to actually have both things split so as they
can be concatenated. This makes the code more modular.

>> You also forgot a call to SetDataDirectoryCreatePerm or
>> pg_dir_create_mode remains to its default.

My apologies, I forgot the last two words of this sentence: "for
pg_rewind". Still, I am wrong because GetDataDirectoryCreatePerm calls
internally SetDataDirectoryCreatePerm. So the API name is confusing
because not only the permissions are fetched, but they are also
applied.

> Are saying *if* a call is forgotten?

That applies as well.

>> The interface of file_perm.h that you are introducing is not confusing
>> anymore though..
>
> Yes, that was the idea. I think it makes it clearer what we are trying
> to do and centralizes variables related to create modes in one place.
>
> I've attached new patches that exclude Windows from permissions tests
> and deal with bootstrap permissions in a better way. PostgresMain() and
> AuxiliaryProcessMain() now use checkDataDir() to set permissions.

GetDataDirectoryCreatePerm() is defined in file_perm.h but it is only
declared for frontends.

At the end, this patch brings in a new abstraction concept to
src/common/ to control a set of pre-determined behaviors:
- The mode to create directories.
- The mode to create files.
- The mode number which can be used for umask.
I am not really convinced that we need to enforce all of them all the
time with a API layer aimed at controlling the behavior of all things at
the same time. Getting this abstraction down one level by allowing each
frontend to set up things the way they want would be nicer in my
opinion. Perhaps others feel differently...

It could be also less confusing if the set of variables in file_perm.h
was designed in such a way that we have the default, and then we can
apply on top of it the set to allow grouping, so as allowing grouping
access would be to do the operation of (default mask + group addition).

The design on the backend is rather neat however there is also
overlapping with SetDataDirectoryCreatePerm() and the GUC
data_directory_mode which are heading toward the same types of goals.
We could reduce that confusion by designing the interface as follows:
- Have read-only GUCs for the directory and file masks on the backend
which get set by the postmaster after looking at the permission of the
data folder at startup. Then if a file or a folder needs to be created,
then look at those values and apply permissions as granted. And also a
GUC to decide the umask to apply but that should not be necessary,
right?
- Frontends are responsible for the decision-making of the permissions.
Not all the variables are used for frontends as well. For example
pg_resetwal and pg_upgrade only touch files.
- For frontends, there are two cases:
-- The client needs to connect to a live Postgres instance, in which
case it can use the SHOW command to get the wanted information. This
applies to pg_rewind with the remote mode (should apply to the second
case actually), pg_basebackup, pg_receivexlog, etc.
-- Binaries work on a local data folder, so permissions can be guessed
from that: pg_rewind, pg_resetwal and pg_upgrade. Having an API in
src/common/ which scans for what to apply is neat. This was in v12 and
some older versions if I recall correctly.

We are two days away from the end of the commit fest, and this patch set
if not yet in a committable stagle, so perhaps at this stage we had
better just retreat and come back to it in next cycle?
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-04-06 05:08:36 Re: Excessive PostmasterIsAlive calls slow down WAL redo
Previous Message Amit Langote 2018-04-06 04:51:58 Re: [HACKERS] Runtime Partition Pruning