WAL replay logic (was Re: [PERFORM] Mount options for Ext3?)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kevin Brown <kevin(at)sysexperts(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: WAL replay logic (was Re: [PERFORM] Mount options for Ext3?)
Date: 2003-01-25 05:40:33
Message-ID: 6917.1043473233@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-performance

Kevin Brown <kevin(at)sysexperts(dot)com> writes:
> One question I have is: in the event of a crash, why not simply replay
> all the transactions found in the WAL? Is the startup time of the
> database that badly affected if pg_control is ignored?

Interesting thought, indeed. Since we truncate the WAL after each
checkpoint, it seems like this approach would no more than double the time
for restart. The win is that it'd eliminate pg_control as a single point of
failure. It's always bothered me that we have to update pg_control on
every checkpoint --- it should be a write-pretty-darn-seldom file,
considering how critical it is.

I think we'd have to make some changes in the code for deleting old
WAL segments --- right now it's not careful to delete them in order.
But surely that can be coped with.
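
To illustrate what "careful to delete them in order" could look like, here is a minimal Python sketch, not the actual PostgreSQL code. The function name `remove_old_segments`, the `keep_from` parameter, and the 16-hex-digit naming pattern are assumptions made for illustration. The idea is that removing the oldest segment first means an interrupted cleanup leaves a contiguous run of the newest files, at least as far as the process's own ordering goes.

```python
import os
import re

# ASSUMPTION: segment names are fixed-width uppercase hex, so plain string
# comparison orders them the same way as their numeric value.
SEGMENT_NAME = re.compile(r"^[0-9A-F]{16}$")


def remove_old_segments(xlog_dir, keep_from):
    """Delete segments whose name sorts below keep_from, oldest first.

    Returns the names removed, in the order they were removed.
    (Hypothetical helper; not a PostgreSQL function.)
    """
    doomed = sorted(
        name for name in os.listdir(xlog_dir)
        if SEGMENT_NAME.match(name) and name < keep_from
    )
    removed = []
    for name in doomed:
        # Oldest first: if we stop partway, only a prefix of the old
        # segments is gone and the survivors are still a contiguous run.
        os.unlink(os.path.join(xlog_dir, name))
        removed.append(name)
    return removed
```

Whether the filesystem makes those unlinks durable in the same order is a separate question that this sketch does not address.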

OTOH, this might just move the locus for fatal failures out of
pg_control and into the OS' algorithms for writing directory updates.
We would have no cross-check that the set of WAL file names visible in
pg_xlog is sensible or aligned with the true state of the datafile area.
We'd have to take it on faith that we should replay the visible files
in their name order. This might mean we'd have to abandon the current
hack of recycling xlog segments by renaming them --- which would be a
nontrivial performance hit.
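
As a rough sketch of "replay the visible files in their name order", the Python below sorts whatever segment names are visible and flags places where the sequence jumps. It is illustrative only: `replay_order` and `find_gaps` are hypothetical helpers, and treating each 16-hex-digit name as a single counter is a simplifying assumption, not the real segment-numbering rule.

```python
import re

# ASSUMPTION: segment names are fixed-width uppercase hex strings.
SEGMENT_NAME = re.compile(r"^[0-9A-F]{16}$")


def replay_order(names):
    """Return the visible segment names in the order they would be replayed."""
    return sorted(n for n in names if SEGMENT_NAME.match(n))


def find_gaps(ordered):
    """Return (previous, current) pairs that are not consecutive.

    SIMPLIFICATION: each name is read as one hex counter, so "consecutive"
    means current == previous + 1.
    """
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if int(cur, 16) != int(prev, 16) + 1:
            gaps.append((prev, cur))
    return gaps
```

A check like this looks only at names. It cannot tell whether a file's contents belong to the name it carries, which is presumably why segments recycled by renaming would be a problem for a scheme that trusts the directory listing alone.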

Comments anyone?

> If there exists somewhere a reasonably succinct description of the
> reasoning behind the current transaction management scheme (including
> an analysis of the pros and cons), I'd love to read it and quit
> bugging you. :-)

Not that I know of. Would you care to prepare such a writeup? There
is a lot of material in the source-code comments, but no coherent
presentation.

regards, tom lane
