Re: Large C files

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jan Urbański <wulczer(at)wulczer(dot)org>
Subject: Re: Large C files
Date: 2011-09-09 00:20:22
Message-ID: CA+Tgmob3H9mnh4xBt30ZEO=QRA_oAvT7UzBHMBWpmMpg+Duafw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 8, 2011 at 4:45 PM, Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
> On 8 September 2011 15:43, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I wouldn't be too enthusiastic about
>> starting a project like this in January, but now seems fine.  A bigger
>> problem is that I don't hear anyone volunteering to do the work.
>
> You seem to have a fairly strong opinion on the xlog.c code. It would
> be useful to hear any preliminary thoughts that you might have on what
> we'd end up with when this refactoring work is finished. If I'm not
> mistaken, you think that it is a good candidate for being refactored
> not so much because of its size, but for other reasons -  could you
> please elaborate on those? In particular, I'd like to know what
> boundaries it is envisaged that the code should be refactored along to
> increase its conceptual integrity, or to better separate concerns. I
> assume that that's the idea, since each new .c file would have to have
> a discrete purpose.

I'm not sure how strong my opinions are, but one thing that I think
could be improved is that StartupXLOG() is really, really long, and it
does a lot of different stuff:

- Read the control file and sanity check it.
- Read the backup label if there is one or otherwise inspect the
control file's checkpoint information.
- Set up for REDO (including Hot Standby) if required.
- Main REDO loop.
- Check whether we reached consistency and/or whether we need a new TLI.
- Prepare to write WAL.
- Post-recovery cleanup.
- Initialize for normal running.

It seems to me that we could improve the readability of that function
by separating out some of the larger chunks of functionality into
their own static functions. That wouldn't make xlog.c any shorter,
but it would make that function easier to understand.

Another gripe I have is that recoveryStopsHere() has non-obvious side
effects. Not sure what to do about that, exactly, but I found it
extremely surprising when first picking my way through this code.

One pretty easy thing to pull out of this file wold be all the
user-callable functions. pg_xlog_recovery_replay_{pause,resume},
pg_is_xlog_recovery_paused, pg_last_xact_replay_timestamp,
pg_is_in_recovery, pg_{start,stop}_backup(), pg_switch_xlog(),
pg_create_restore_point(), pg_current_xlog_{insert_,}location,
pg_last_xlog_{receive,replay}_location(), pg_xlogfile_name_offset(),
pg_xlogfile_name().

Another chunk that seems like it would be pretty simple to separate
out is the checkpoint-related stuff: LogCheckpointStart(),
LogCheckpointEnd(), CreateCheckPoint(), CheckPointGuts(),
RecoveryRestartPoint(), CreateRestartPoint(), KeepLogSeg(). Not a lot
of code, maybe, but it seems clearly distinguishable from what the
rest of the file is about.

I'm sure there are other good ways to do it, too...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2011-09-09 00:42:03 Re: pg_last_xact_insert_timestamp
Previous Message Josh Berkus 2011-09-09 00:16:59 Re: Patch to improve reliability of postgresql on linux nfs