Re: silent data loss with ext4 / all current versions

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: silent data loss with ext4 / all current versions
Date: 2016-01-25 07:30:47
Message-ID: CAB7nPqT8kNJ0nhNORppSLnPyT3AbvdZeZEarjYMLd_XtASdMRg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 22, 2016 at 9:32 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> On Fri, Jan 22, 2016 at 5:26 PM, Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> On 01/22/2016 06:45 AM, Michael Paquier wrote:
>>> Here are some comments about your patch after a look at the code.
>>>
>>> Regarding the additions of fsync_fname() calls in xlog.c:
>>> 1) In InstallXLogFileSegment, rename() will be called only if
>>> HAVE_WORKING_LINK is not used, which happens only on Windows and
>>> cygwin. We could add it for consistency, but it should be within the
>>> #else/#endif block. It is not critical as of now.
>>> 2) The call in RemoveXlogFile is not necessary, the rename happening
>>> only on Windows.
>>
>> Hmmm, OK. Are we sure HAVE_WORKING_LINK is false only on Windows, or could
>> there be some other platforms? And are we sure the file systems on those
>> platforms are safe without the fsync call?
>> That is, while the report references ext4, there may be other file systems
>> with the same problem - ext4 was used mostly as it's the most widely used
>> Linux file system.
>
> From pg_config_manual.h:
> #if !defined(WIN32) && !defined(__CYGWIN__)
> #define HAVE_WORKING_LINK 1
> #endif
> If we want to be consistent with what POSIX proposes, I am not against
> adding it.
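
For readers following along, the two install strategies selected by that macro can be sketched as below. This is only an illustration under assumptions: install_file_sketch is a hypothetical name, not the function in xlog.c, and the error handling is reduced to a return code.

```c
#include <stdio.h>
#include <unistd.h>

/* Same guard as the pg_config_manual.h excerpt quoted above. */
#if !defined(WIN32) && !defined(__CYGWIN__)
#define HAVE_WORKING_LINK 1
#endif

/*
 * Hypothetical helper (not PostgreSQL code): move tmppath to path.
 * Returns 0 on success, -1 on failure.
 */
static int
install_file_sketch(const char *tmppath, const char *path)
{
#ifdef HAVE_WORKING_LINK
    /* link() refuses to overwrite an existing target, unlike rename() */
    if (link(tmppath, path) != 0)
        return -1;
    /* drop the temporary name; the data stays reachable through path */
    unlink(tmppath);
    return 0;
#else
    /* platforms without usable hard links fall back to rename() */
    return rename(tmppath, path);
#endif
}
```

Neither branch flushes the directory entry by itself, which is why the question of an extra fsync applies to both.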

I did some tests on NTFS using Cygwin, and the effects of the rename()
calls persisted even after powering off the VM. But I agree that adding
an fsync() in both cases would be fine.

>>> Thoughts?
>>
>> Thanks for the review and comments. I think the question is whether we
>> want to do the additional fsync() only when it ultimately may lead to data
>> loss, or even in cases where it may cause operational issues (e.g. switching
>> back to recovery needlessly).
>> I'd vote for the latter, as I think it makes the database easier to operate
>> (less manual interventions) and the performance impact is 0 (as those fsyncs
>> are really rare).
>
> My first thought after looking at the patch is that I am not
> against adding those fsync calls on HEAD, as there is some
> advantage in not going back to recovery in most cases and in ensuring
> consistent names, but as they do not imply any data loss I would not
> encourage a back-patch. Adding them seems harmless at first sight, I
> agree, but those are not actual bugs.

OK. It is true that with your patch PGDATA would be fsync'd in 4 code
paths, none of which is taken often:
- Renaming the tablespace map file and the backup label file (three times)
- Renaming to recovery.done
So, what do you think about the attached patch? Moving the calls into
the critical sections is not really necessary, except when installing a
new segment.
--
Michael

Attachment: xlog-fsync-v3.patch (text/x-patch, 4.7 KB)
