Re: fsync reliability

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: fsync reliability
Date: 2011-04-25 19:53:08
Message-ID: BANLkTinCTTbPd_uc3p3cvutnkRTzSMk-sw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Apr 25, 2011 at 5:00 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> Stop right there; the slow path was the only one that had any hope of being
> correct.  It can actually slow things by a factor of 100X or more,
> worst-case.  "So, we currently have the choice between filesystem corruption
> or major performance loss":  yes, you do.  Writing files is tricky and it
> can either be slow or safe.  If you're going to avoid even trying to enforce
> the right thing here, you're really going to get really burned.

Well, no. That's like saying the whole database can't possibly process
transactions faster than the rate at which fsyncs can happen. That's
not true because we can process transactions in parallel and fsync a
whole bunch simultaneously.
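
To make the batching point concrete, here is a minimal sketch of one fsync covering many transactions. It is an illustration only, not PostgreSQL's WAL code; the function name commit_batch and the newline-delimited record format are assumptions made up for the example.

```python
import os

def commit_batch(log_path, records):
    """Append every record, then make them all durable with a single fsync.

    Hypothetical helper: each record stands in for one transaction's
    commit record. N transactions pay for one fsync instead of N.
    """
    with open(log_path, "ab") as log:
        for record in records:
            log.write(record + b"\n")
        log.flush()             # push Python's buffer to the kernel
        os.fsync(log.fileno())  # one wait for the disk covers the whole batch
    return len(records)
```

The throughput limit then becomes fsyncs per second times the batch size, rather than fsyncs per second.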

The API tytso and company are suggesting is that if you want
reasonable performance you should create a thread for each file, fsync
in that thread and then do your rename. Hardly the sanest API one
could imagine.
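
For reference, a rough sketch of what that pattern looks like from the application side: write a temporary file, fsync it, rename it over the target, with one worker thread per file so the fsync waits overlap. The names replace_durably and replace_many are invented for this example, and it leaves out the fsync of the containing directory that a careful implementation would probably also want.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def replace_durably(path, data):
    """Write data to a temp file, fsync it, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem)
    # for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())    # the slow step: wait for data to reach disk
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return path

def replace_many(updates, max_workers=8):
    """updates maps path -> bytes; each file gets its own worker thread."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(replace_durably, p, d) for p, d in updates.items()]
        return [f.result() for f in futures]
```

Skipping the os.fsync call is the "fast path" under discussion: the rename can reach disk before the data does, which is how a crash can leave an empty or zero-filled file under the final name.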

And if you fail to do that you don't just risk losing data. You get a
filesystem state that *never* existed. It's as if we said that if the
database crashes your transaction might be rolled back, it might be
committed, and we might just replace your data with zeros. Huh?

--
greg
