Incorrect error handling for two-phase state files resulting in data loss

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Postgres hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Incorrect error handling for two-phase state files resulting in data loss
Date: 2018-07-09 05:03:09
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi all,

This is a follow-up of the following thread where I have touched the
topic of corrupted 2PC files being completely ignored by recovery:
I have posted a patch on this thread, but after more reviews I have
noticed that it does not close all the holes.

What happens is that when a two-phase state file cannot be read by
recovery when the file is on disk, which is what ReadTwoPhaseFile does,
then its data is simply skipped. It is perfectly possible that an
on-disk 2PC is missing at crash recovery if it has been already
processed, so if trying to open the file and seeing an ENOENT, then the
file can be properly skipped. However, if you look at this API, it
skips *all* errors instead of the ones that matter. For example, if a
file needs to be processed and is cannot be opened because of permission
issues, then it is simply skipped. If its size, its CRC or its magic
number is incorrect, the same thing happens. Skipping unconditionally
files this way can result in data loss.

I think that we really need to harden things, by making
ReadTwoPhaseFile() fail hard is it finds something unexpected, which is
in this case anything except trying to open a file which fails on
ENOENT, and that this stuff should be back-patched.


Attachment Content-Type Size
0001-Fail-hard-when-facing-corrupted-two-phase-state-file.patch text/x-diff 7.3 KB


Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuro Yamada 2018-07-09 05:18:14 Re: Postgres 11 release notes
Previous Message Imai, Yoshikazu 2018-07-09 03:18:53 RE: Locking B-tree leafs immediately in exclusive mode