Re: [HACKERS] Infrastructure changes for recovery

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: List pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [HACKERS] Infrastructure changes for recovery
Date: 2008-09-08 17:34:01
Message-ID: 1905.1220895241@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> Included patch with the following changes:

> * new postmaster mode known as consistent recovery, entered only when
> recovery passes safe/consistent point. InRedo is now set in all
> processes when started/connected in consistent recovery mode.

I looked at this and didn't like the InRedo signaling at all. In the
first place, it looks like you're expecting the flag to be passed down
via fork(), but that won't work in EXEC_BACKEND mode. (Yes, easily
fixed, but it's wrong as-is.) In the second place, the method of
signaling it to the bgwriter looks to have race conditions: in
principle, at least, I think the startup process could try to clear
the shared-memory InRedo flag before the bgwriter has gotten as far as
setting it. (This might work if the startup process can't exit redo
mode without the bgwriter having finished a checkpoint, but such a
dependency is too rube goldbergian for me.)

ISTM that it would probably be better if there were exactly one InRedo
flag in shared memory, probably in xlog.c's shared state, with the
postmaster not being responsible for setting or clearing it; rather
the startup process should do those things.

> * bgwriter and stats process starts in consistent recovery mode.
> bgwriter changes mode when startup process completes.

I'm not sure about the interaction of this. In particular, what about
recovery restart points before we have reached the safe stop point?
I don't think we want to give up the capability of having those.

Also, it seems pretty bogus to update the in-memory ControlFile
checkpoint values before the restart point is actually done. It looks
to me like what you have done is to try to use those fields as signaling
for the restart request in addition to their existing purposes, which
I think is confusing and probably dangerous. I'd rather there were a
different signaling path and ControlFile maintains its currrent
definition.

I found the changes in bgwriter.c unreadable. You've evidently
moved blocks of code around, but exactly what did you mean to
change? Why is there so much duplicated code now?

> I've kept the distinction between restartpoints and checkpoints in
> code, to avoid convoluted code.

Hmm, I dunno, it seems like that might be a bad choice. Are you sure
it's not cleaner to just use the regular checkpoint code?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2008-09-08 18:01:50 Re: [HACKERS] Infrastructure changes for recovery
Previous Message Andrew Gierth 2008-09-08 17:22:27 Re: sql2008 diff sql2003

Browse pgsql-patches by date

  From Date Subject
Next Message Simon Riggs 2008-09-08 18:01:50 Re: [HACKERS] Infrastructure changes for recovery
Previous Message Tom Lane 2008-09-08 14:49:18 Re: hash index improving v3