Re: initdb -S and tablespaces

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: initdb -S and tablespaces
Date: 2015-04-03 16:32:32
Message-ID: 20150403163232.GA28444@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Abhijit Menon-Sen wrote:
> At 2015-01-15 14:32:45 +0100, andres(at)2ndquadrant(dot)com wrote:

> Patch attached.
>
> Changes:
>
> 1. Renamed perform_fsync to fsync_recursively (otherwise it would read
> "fsync_pgdata(pg_data)")

Okay, but as far as I can tell this function is very specific to
PGDATA; you couldn't use it in any other directory (or pg_tblspc would
be missing and that would cause an error, right?) Therefore I think it
would make sense to have the name reflect this; maybe
fsync_datadir_recursively(data_directory)
or
fsync_pgdata_recursively(data_directory)
would work? But then, since the name is already telling us that it's
all about PGDATA, maybe we can remove the "recursively" part of the
name? Not sure about any of this; other opinions?

I also noticed that walkdir() thinks it is completely agnostic on what
the passed action is; except that the comment at the bottom talks about
how fsync on directories is important, which seems out of place.

I wonder about walktblspc_links() too. Seems to me that that function
is pretty much the same as walkdir(); would it work to add a flag to the
latter to change the behavior in whatever way needs to be changed, and
remove the former? Hmm ... Actually, since surely we must follow
symlinks everywhere, why do we have to do this separately for pg_tblspc?
Shouldn't that link-following occur automatically when walking PGDATA in
the first place?

> 2. Added ControlData->fsync_disabled to record whether fsync was ever
> disabled while the server was running (tested in CreateCheckPoint)
> 3. Run fsync_recursively at startup only if fsync is enabled
> 4. Run it if we're doing crash recovery, or fsync was disabled
> 5. Use pg_flush_data in pre_sync_fname
> 6. Issue fsync on directories too
> 7. Tested that it works if pg_xlog is a symlink (no changes).
>
> (In short, everything you mentioned in your earlier mail.)
>
> Note that I set ControlData->fsync_disabled to false in BootstrapXLOG,
> but it gets set to true during a later CreateCheckPoint(). This means
> we run fsync again at startup after initdb. I'm not sure what to do
> about that.

This all looks reasonable to me. I just noticed, though, that
the fd.c routines test enableFsync and do nothing if it's not enabled;
but fsync_recursively goes to all the trouble of doing stuff even if
disabled, and the actions are skipped later; the enableFsync check is
then responsibility of the caller. This seems a bit prone to later
confusion. Maybe fsync_recursively should Assert() that it's only being
called with enableFsync=on; or perhaps we can have it return early if
it's unset.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Gierth 2015-04-03 17:39:45 Re: Abbreviated keys for Numeric
Previous Message Abhijit Menon-Sen 2015-04-03 15:37:48 Re: a fast bloat measurement tool (was Re: Measuring relation free space)