Re: fsync reliability

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fsync reliability
Date: 2011-04-22 20:55:30
Message-ID: 4DB1EB42.2090009@2ndQuadrant.com
Lists: pgsql-hackers

On 04/22/2011 09:32 AM, Simon Riggs wrote:
> OK, that's good, but ISTM we still have a hole during
> RemoveOldXlogFiles() where we don't fsync or open/close the file, just
> rename it.
>

Many applications also rely on rename behaving the way we hope here,
even though that behavior is not technically guaranteed by POSIX. Early
versions of ext4 broke it, and that caused a giant outcry of
complaints.
http://www.h-online.com/open/news/item/Ext4-data-loss-explanations-and-workarounds-740671.html
has a good summary. This was broken on ext4 from around 2.6.28 to
2.6.30, but the fix was in such demand that even the relatively lazy
distributions have backported it to their 2.6.28/2.6.29 kernels.
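
For anyone who wants to see what not relying on that behavior looks
like, here is a minimal sketch in Python of the write-then-rename
pattern with explicit fsync calls. It is not how PostgreSQL handles WAL
segments; the function name durable_replace and the ".tmp-" prefix are
made up for illustration, and opening a directory with os.open() in
order to fsync it assumes a POSIX system such as Linux.

```python
import os
import tempfile


def durable_replace(path: str, data: bytes) -> None:
    """Replace `path` with `data` without trusting the filesystem to
    flush the file contents before the rename reaches disk.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Write the new contents to a temporary file in the same directory,
    # so the rename below never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Force the data to stable storage before the rename.
            os.fsync(tmp.fileno())
        # Atomically swap the new file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the swap.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise

    # fsync the containing directory so the rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The fsync on the temporary file covers the case that bit people on
early ext4, where the rename could hit disk before the data did. The
fsync on the directory is there to make the rename itself durable.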

There may be a small window for metadata issues here if you've put the
WAL on ext2 and there's a crash in the middle of a rename. That factors
into why any suggestions I make about using ext2 come with a load of
warnings about the risk of not journaling. It's hard to predict every
type of issue that fsck might force you to come to terms with after a
crash on ext2, and if there was a problem with this path I'd expect it
to show up as something to be reconciled then.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
