Re: BUG #14999: pg_rewind corrupts control file global/pg_control

From: TipTop Labs <office(at)tiptop-labs(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14999: pg_rewind corrupts control file global/pg_control
Date: 2018-04-03 20:13:05
Message-ID: A89569FD-1CCB-4060-932B-B61DD0F756BF@tiptop-labs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

> On Apr 3, 2018, at 4:41 AM, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Mon, Apr 02, 2018 at 07:27:55PM -0400, Tom Lane wrote:
>> I think that a saner design would involve not truncating any particular
>> target file until we first get some data to write to it. Christian's
>> original patch shown at
>> https://www.postgresql.org/message-id/CAB7nPqTyq3cjphJmoCPc0n8r90O4j1BpZC2BMaj8dKZ7g0-mtw%40mail.gmail.com
>> seems to be headed in that direction, in that it forces ordering of the
>> data fetch (which is likely a good idea anyway to ensure locality of
>> reference) and then performs the truncation when we get the first chunk
>> of data for a file.
>
> Please note that Christopher's patch is awkwardly making the difference
> between a configuration file and a relation file by abusing of chunkoff.
> It also blows up on the test t/006_readonly.pl included in some of my
> previous patches.

IMO this is a better characterization of my original patch (from https://github.com/tiptop-labs/postgres-patches/blob/master/bug_14999/README <https://github.com/tiptop-labs/postgres-patches/blob/master/bug_14999/README>):

bug
===

libpq_fetch.c loops twice over files in pgdata, a first time in
libpq_executeFileMap(), and then a second time (for files with action
FILE_ACTION_COPY or FILE_ACTION_COPY_TAIL) in receiveFileChunks().
For FILE_ACTION_COPY, it deletes (truncates) the file from the target directory
already during the first loop. If pg_rewind then encounters a read-only file
(e.g. server.key) still during the first loop, it exits with pg_fatal
("could not open target file"). After such truncation of global/pg_control
pg_rewind cannot run again ("unexpected control file size 0, , expected 8192")
and a restore from pg_basebackup is needed.

patch
=====

Truncation of the file was moved to the second loop. Truncation occurs there if
chunks are written into files at offset 0. This is the case for
FILE_ACTION_COPY. An additional SQL "ORDER BY path, begin" ensures that these
chunks are processed first.

>> Michael's proposal to have separate error handling for data and config
>> files is not bad in itself, and it does improve the behavior for one
>> foreseeable trouble case. But I think we need the other thing as well,
>> or some variant of it, before we can regard pg_rewind as anything except
>> a chainsaw with the safety catch missing.
>
> I think that I agree with your proposal to delay the file truncation up
> to the moment where the first chunk for a file is received, as well as I
> agree to the fact that enforcing the ordering of the files would give a
> more efficient work base. However, it seems to me that both concepts
> should be included to get a complete patch. In short, a complete patch
> could do in its most complex shape:
> 1) Ordering of the data fetched by libpq.
> 2) Truncation of files a first chunk is received only if it is a
> configuration file.
> 3) Discard any data related to files that cannot be opened:
> 3-1) If the file worked on is a relation file, we have a hard failure.
> 3-2) If the file worked on is a configuration file, then a warning is
> issued and we move on to the next file. This fixes the case of
> read-only files without being user-intrusive.
>
> In the case of 3-2) having a warning instead of a failure is subject to
> discussion though. Perhaps we could live with a simple failure, but
> that's rather user-unfriendly, still we could document the limitation
> and tell users to not put read-only files on the target which is also
> present on the source.
>
> At the same time pg_rewind considers in my patch things like FSM and VM
> files as configuration files, which you want to truncate out of the way
> and fully copy, but also want to fail hard if they cannot be written to!
>
> So at the end a failure for everything may make the most sense, if we
> document that people don't put read-only files in the target's data
> folder which can cause failures.
>
> Note also that if we use isRelDataFile() in receiveFileChunks() it is
> possible to decide if a file needs to be truncated when receiving its
> first chunk, so it is possible to drop completely the concept of file
> type as well. This would make the patch even more simple, and there is
> no need to rely on other weird tweaks like chunkoff or such.
>
> The ordering of the items when doing a local copy is not mandatory as
> files are processed one by one, still I think that it could be more
> solid to reuse isRelDataFile() in rewind_copy_file_range() and drop the
> "trunc" argument.
>
> Thoughts?
> --
> Michael

Regards,
Christian

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message 石勇虎 2018-04-04 00:39:00 can not kill client session
Previous Message Bruce Momjian 2018-04-03 17:23:52 Re: BUG #15122: can't import data if table has a constraint with a function calling another function

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2018-04-03 20:47:23 Re: Foreign keys and partitioned tables
Previous Message Tomas Vondra 2018-04-03 19:50:55 Re: WIP: BRIN multi-range indexes