Re: silent data loss with ext4 / all current versions

From: Greg Stark <stark(at)mit(dot)edu>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: silent data loss with ext4 / all current versions
Date: 2016-01-22 12:41:58
Message-ID: CAM-w4HNdUkN7=8ob3xX-t613L_mWEuzFcRCFZ6hWVCjwM_16_g@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 22, 2016 at 8:26 AM, Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> On 01/22/2016 06:45 AM, Michael Paquier wrote:
>
>> So, I have been playing with a Linux VM with VMware Fusion and on
>> ext4 with data=ordered the renames are getting lost if the root
>> folder is not fsync. By killing-9 the VM I am able to reproduce that
>> really easily.
>
>
> Yep. Same experience here (with qemu-kvm VMs).
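
For reference, the failure mode quoted above (a rename lost because the containing directory was never fsynced) can be illustrated with a minimal Python sketch. This is not the PostgreSQL code; `durable_rename` is a hypothetical helper name, and it assumes a Linux system where a directory can be opened read-only and passed to fsync, with source and destination in the same directory.

```python
import os


def durable_rename(src, dst):
    """Rename src to dst and try to make the rename survive a crash.

    Hypothetical helper for illustration; assumes src and dst live in
    the same directory on a Linux filesystem.
    """
    # Flush the file contents first, so the new name never refers to
    # data that has not reached the disk.
    fd = os.open(src, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)

    os.rename(src, dst)

    # The rename itself is a change to the directory, so fsync the
    # directory too. Skipping this step is what lets the rename vanish
    # after a hard kill.
    parent = os.path.dirname(os.path.abspath(dst))
    dir_fd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

If the two names were in different directories, the source directory would need an fsync as well.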

I still think a better approach for this is to run the database on an
LVM volume and take lots of snapshots. No VM needed, though it doesn't
hurt. LVM volumes are below the level of the filesystem and a snapshot
captures the state of the raw blocks the filesystem has written to the
block layer. The block layer does no caching; the drive may, but
neither the VM solution nor LVM would capture that.

LVM snapshots would have the advantage that you can keep running the
database and you can take lots of snapshots with relatively little
overhead. Having dozens or hundreds of snapshots would be an
unacceptable performance drain in production, but for testing it
should be practical, and they take relatively little space -- just the
blocks changed since the snapshot was taken.
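
A rough sketch of what such a test loop could look like, driving `lvcreate` from Python while the database keeps running. The volume group and logical volume names, the snapshot naming scheme, the 1G copy-on-write size, and the helper names are all assumptions for illustration; running it for real needs root and free space in the volume group.

```python
import subprocess
import time


def snapshot_command(vg, lv, name, size="1G"):
    """Build the lvcreate argv for a copy-on-write snapshot of vg/lv.

    size is the space reserved for blocks changed after the snapshot;
    the 1G default is an arbitrary value picked for this sketch.
    """
    return ["lvcreate", "--snapshot", "--size", size,
            "--name", name, f"{vg}/{lv}"]


def take_snapshots(vg, lv, count, interval_s, run=subprocess.run):
    """Take `count` snapshots of vg/lv, sleeping interval_s between them.

    `run` is injectable so the loop can be exercised without LVM.
    Returns the snapshot names in the order they were created.
    """
    names = []
    for i in range(count):
        name = f"{lv}_snap_{i:04d}"
        run(snapshot_command(vg, lv, name), check=True)
        names.append(name)
        time.sleep(interval_s)
    return names
```

For example, `take_snapshots("vg0", "pgdata", 100, 0.5)` would leave a hundred snapshots that can each be mounted and checked afterwards, where `vg0` and `pgdata` are placeholder names.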

--
greg
