Re: Hot standby, recovery infra

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, recovery infra
Date: 2009-01-28 12:42:34
Message-ID: 1233146554.2327.2343.camel@ebony.2ndQuadrant
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Wed, 2009-01-28 at 12:04 +0200, Heikki Linnakangas wrote:
> I've been reviewing and massaging the so called recovery infra patch.

Thanks.

> I feel quite good about this patch now. Given the amount of code
> churn, it requires testing, and I'll read it through one more time
> after sleeping over it.

There's nothing major I feel we should discuss.

The way restartpoints happen is a useful improvement, thanks.

> Simon, do you see anything wrong with this?

Few minor points

* I think we are now renaming the recovery.conf file too early. The
comment says "We have already restored all the WAL segments we need from
the archive, and we trust that they are not going to go away even if we
crash." We have, but the files overwrite each other as they arrive, so
if the last restartpoint is not in the last restored WAL file then it
will only exist in the archive. The recovery.conf is the only place
where we store the information on where the archive is and how to access
it, so by renaming it out of the way we will be unable to crash recover
until the first checkpoint is complete. So the way this was in the
original patch is the correct way to go, AFAICS.

* my original intention was to deprecate log_restartpoints and would
still like to do so. log_checkpoints does just as well for that. Even
less code than before...

* comment on BgWriterShmemInit() refers to CHECKPOINT_IS_STARTUP, but
the actual define is CHECKPOINT_STARTUP. Would prefer the "is" version
because it matches the IS_SHUTDOWN naming.

* In CreateCheckpoint() the if test on TruncateSubtrans() has been
removed, but the comment has not been changed (to explain why).

* PG_CONTROL_VERSION bump should be just one increment, to 844. I
deliberately had it higher to help spot mismatches earlier, and to avoid
needless patch conflicts.

So it looks pretty much ready for commit very soon.

We should continue to measure performance of recovery in the light of
these changes. I still feel that fsyncing the control file on each
XLogFileRead() will give a noticeable performance penalty, mostly
because we know doing exactly the same thing in normal running caused a
performance penalty. But that is easily changed and cannot be done with
any certainty without wider feedback, so no reason to delay code commit.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Gregory Stark 2009-01-28 12:56:38 Re: posix_fadvise v22
Previous Message Stephen Frost 2009-01-28 11:54:47 Re: pg_upgrade project status