Re: Use of fsync; was Re: Pg_upgrade speed for many tables

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: Use of fsync; was Re: Pg_upgrade speed for many tables
Date: 2012-12-01 03:43:29
Message-ID: 20121201034329.GG27120@momjian.us
Lists: pgsql-hackers

On Mon, Nov 26, 2012 at 02:43:19PM -0500, Bruce Momjian wrote:
> > >> In any event, I think the documentation should caution that the
> > >> upgrade should not be deemed to be a success until after a system-wide
> > >> sync has been done. Even if we use the link rather than copy method,
> > >> are we sure that that is safe if the directories recording those links
> > >> have not been fsynced?
> > >
> > > OK, the above is something I have been thinking about, and obviously you
> > > have too. If you change fsync from off to on in a cluster, and restart
> > > it, there is no guarantee that the dirty pages you read from the kernel
> > > are actually on disk, because Postgres doesn't know they are dirty.
> > > They probably will be pushed to disk by the kernel in less than one
> > > minute, but still, it doesn't seem reliable. Should this be documented
> > > in the fsync section?
> > >
> > > Again, another reason not to use fsync=off, though your example of the
> > > file copy is a good one. As you stated, this is a problem with the file
> > > copy/link, independent of how Postgres handles the files. We can tell
> > > people to use 'sync' as root on Unix, but what about Windows?
> >
> > I'm pretty sure someone mentioned the way to do that on Windows in
> > this list in the last few months, but I can't seem to find it. I
> > thought it was the initdb fsync thread.
>
> Yep, the code is already in initdb to fsync a directory --- we just need
> a way for pg_upgrade to access it.

I have developed the attached patch that does this. It adds a
--sync-only option to initdb, turns off all durability in pg_upgrade,
and has pg_upgrade run initdb --sync-only; this gives us another nice
speedup!

            ------ SSD ------   --- magnetic ---
  tables       git     patch       git     patch
       1     11.11     11.11     11.10     11.13
    1000     20.57     19.89     20.72     19.30
    2000     28.02     25.81     28.50     27.53
    4000     42.00     43.59     46.71     46.84
    8000     89.66     74.16     89.10     73.67
   16000    157.66    135.98    159.97    153.48
   32000    316.24    296.90    334.74    308.59
   64000    814.97    715.53    797.34    727.94

(I am very happy with these times. Thanks to Jeff Janes for his
suggestions.)
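
For readers who want the relative change rather than the raw times, here is a small sketch that computes the percentage saved by the patch for each row. The numbers are copied from the table above; the helper name percent_saved and the TIMINGS layout are only illustrative and are not part of the patch.

```python
# Timings copied from the table in this message:
# tables -> (SSD git, SSD patch, magnetic git, magnetic patch)
TIMINGS = {
    1: (11.11, 11.11, 11.10, 11.13),
    1000: (20.57, 19.89, 20.72, 19.30),
    2000: (28.02, 25.81, 28.50, 27.53),
    4000: (42.00, 43.59, 46.71, 46.84),
    8000: (89.66, 74.16, 89.10, 73.67),
    16000: (157.66, 135.98, 159.97, 153.48),
    32000: (316.24, 296.90, 334.74, 308.59),
    64000: (814.97, 715.53, 797.34, 727.94),
}


def percent_saved(before, after):
    """Return how much lower `after` is than `before`, in percent.

    A negative result means the patched run was slower.
    """
    return (before - after) / before * 100.0


for tables, (ssd_git, ssd_patch, mag_git, mag_patch) in TIMINGS.items():
    ssd = percent_saved(ssd_git, ssd_patch)
    mag = percent_saved(mag_git, mag_patch)
    print(f"{tables:>6}  SSD {ssd:6.1f}%  magnetic {mag:6.1f}%")
```

Note that some rows come out slightly negative (for example 4000 tables on SSD), so single runs like these should be read as indicative rather than exact.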

I have also added documentation to the 'fsync' configuration variable
warning about dirty buffers and recommending that they be flushed to
disk before the cluster is considered crash-recovery safe.
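
The general idea of skipping durability during the work and flushing everything once afterwards can be sketched as below. This is not the code from the patch or from initdb; it is a minimal illustration in Python, assuming a Unix-like system where a directory can be opened read-only and passed to fsync(), and assuming the tree holds only regular files and directories. The function names fsync_path and sync_tree are made up for this example.

```python
import os


def fsync_path(path):
    """Open `path` read-only and fsync it.

    Assumption: this works for directories as well as files, which is
    true on Linux but not on Windows.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def sync_tree(root):
    """fsync every file and directory under `root`.

    Returns the number of paths that were synced. Directories are
    synced too, so that newly created or linked entries are durable.
    """
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            fsync_path(os.path.join(dirpath, name))
            count += 1
        fsync_path(dirpath)
        count += 1
    return count
```

Calling sync_tree on a data directory after a run with durability disabled would push that tree's dirty data to disk without needing a system-wide 'sync' as root. It does not answer the Windows question raised earlier in the thread, since directories cannot be opened this way there.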

I consider this patch ready for 9.3 application (meaning it is not a
prototype).

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachment Content-Type Size
fsync.diff text/x-diff 6.6 KB
