Re: silent data loss with ext4 / all current versions

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: silent data loss with ext4 / all current versions
Date: 2016-01-19 07:03:25
Message-ID: CAB7nPqSP5OcsL_wVBiUa7Rt98UtMU5t9u-tqHrTT2rdwRR3fXw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jan 19, 2016 at 3:58 PM, Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
>
> On 01/19/2016 07:44 AM, Michael Paquier wrote:
>>
>> On Wed, Dec 2, 2015 at 3:24 PM, Michael Paquier
>> <michael(dot)paquier(at)gmail(dot)com> wrote:
>>>
>>> On Wed, Dec 2, 2015 at 3:23 PM, Michael Paquier
>>> <michael(dot)paquier(at)gmail(dot)com> wrote:
>>>>
>>>> On Wed, Dec 2, 2015 at 7:05 AM, Tomas Vondra
>>>> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>>>>
>>>>> Attached is v2 of the patch, that
>>>>>
>>>>> (a) adds explicit fsync on the parent directory after all the rename()
>>>>> calls in timeline.c, xlog.c, xlogarchive.c and pgarch.c
>>>>>
>>>>> (b) adds START/END_CRIT_SECTION around the new fsync_fname calls
>>>>> (except for those in timeline.c, as the START/END_CRIT_SECTION is
>>>>> not available there)
>>>>>
>>>>> The patch is fairly trivial and I've done some rudimentary testing, but
>>>>> I'm sure I haven't exercised all the modified paths.
>>>>
>>>>
>>>> I would like to have an in-depth look at that after finishing the
>>>> current CF, I am the manager of this one after all... Could you
>>>> register it to 2016-01 CF for the time being? I don't mind being
>>>> beaten by someone else if this someone has some room to look at this
>>>> patch..
>>>
>>>
>>> And please feel free to add my name as reviewer.
>>
>>
>> Tomas, I am planning to have a look at that, because it seems to be
>> important. In case it becomes lost on my radar, do you mind if I add
>> it to the 2016-03 CF?
>
>
> Well, what else can I do? I have to admit I'm quite surprised this is still
> rotting here, considering it addresses a rather serious data loss /
> corruption issue on a pretty common setup.

Well, I think you did what you could. Now we need to be sure that
this patch gets a serious look and that it gets in. So for now my
guess is that not losing track of it would be a good first move. I
have added it here to attract more attention:
https://commitfest.postgresql.org/9/484/
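
For readers following the thread, item (a) in the quoted patch description
(fsync the parent directory after rename()) can be sketched roughly as
below. This is only an illustration using plain POSIX calls, not the patch
itself: the names fsync_path_sketch and rename_and_sync_dir are
hypothetical, error reporting is reduced to a return code, and the sketch
assumes the old and new names live in the same directory.

```c
#include <errno.h>
#include <fcntl.h>
#include <libgen.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: open a file or directory read-only and fsync it. */
static int
fsync_path_sketch(const char *path)
{
    int fd = open(path, O_RDONLY);
    int rc;
    int saved_errno;

    if (fd < 0)
        return -1;

    rc = fsync(fd);
    saved_errno = errno;        /* keep fsync's errno across close() */
    close(fd);
    errno = saved_errno;
    return rc;
}

/*
 * Hypothetical helper: rename oldpath to newpath, then fsync the directory
 * containing newpath so that the new directory entry itself reaches disk.
 * Returns 0 on success, -1 on failure.
 */
static int
rename_and_sync_dir(const char *oldpath, const char *newpath)
{
    char dirbuf[4096];

    if (strlen(newpath) >= sizeof(dirbuf))
        return -1;

    if (rename(oldpath, newpath) != 0)
        return -1;

    /* dirname() may modify its argument, so work on a private copy. */
    strcpy(dirbuf, newpath);
    return fsync_path_sketch(dirname(dirbuf));
}
```

Without the directory fsync, the rename may exist only in the kernel's
cache at the time of a crash, which is the failure mode this thread is
about. Whether the file contents also need to be flushed before the rename
depends on the call site and is left out of this sketch.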
--
Michael
