Re: pg_rewind failure by file deletion in source server

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_rewind failure by file deletion in source server
Date: 2015-06-26 19:07:59
Message-ID: CA+TgmoYQ6vaE_wnS_UadD4r7Y3Bx5pKgnqfV-+duU0YFvR836A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jun 24, 2015 at 9:41 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> On Wed, Jun 24, 2015 at 3:43 PM, Michael Paquier wrote:
>>> I just realized another problem: We recently learned the hard way that some
>>> people have files in the data directory that are not writeable by the
>>> 'postgres' user
>>> (http://www.postgresql.org/message-id/20150523172627.GA24277@msg.df7cb.de).
>>> pg_rewind will try to overwrite all files it doesn't recognize as relation
>>> files, so it's going to fail on those. A straightforward fix would be to
>>> first open the destination file in read-only mode, and compare its contents,
>>> and only open the file in write mode if it has changed. It would still fail
>>> when the files really differ, but I think that's acceptable.
>>
>> If I am missing nothing, two code paths need to be patched here:
>> copy_file_range and receiveFileChunks. copy_file_range is
>> straight-forward. Now wouldn't it be better to write the contents into
>> a temporary file, compare their content, and then switch if necessary
>> for receiveFileChunks?
>
> After sleeping on it, I have been looking at this issue again and came
> up with the patch attached. Instead of checking if the content of the
> target and the source file are the same, meaning that we would still
> need to fetch chunk content from the server in stream mode, I think
> that it is more robust to check if the target file can be opened and
> check for EACCES on failure, bypassing it if process does not have
> permissions on it. the patch contains a test case as well, and is
> independent on the rest sent upthread.
> Thoughts?

That seems scary as heck to me. Suppose that you run pg_rewind and
SELinux decides, due to some labeling problem, to deny access to some
files. So we just skip those. Then the user tries to start the
server and maybe it works (since the postgres executable is labeled
differently) or maybe it fails and the user runs restorecon and then
it works. Kaboom, your database is corrupt.

I realize that the recent fsync fiasco demonstrated that people keep
files not readable by PG in the data directory, but that still seems
loopy to me. I realize that we can't de-support that in the back
branches because people have existing configurations that we can't
just break. But maybe we should say, look, that's just not compatible
with pg_rewind, because where the integrity of your data is concerned
"oops, I can't read this, that's probably OK" just does not seem good
enough.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2015-06-26 19:09:03 Re: Removing SSL renegotiation (Was: Should we back-patch SSL renegotiation fixes?)
Previous Message Robert Haas 2015-06-26 18:55:54 Re: 9.5 release notes