Re: try to find out the checkpoint record?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: try to find out the checkpoint record?
Date: 2004-03-14 05:59:14
Message-ID: 18664.1079243954@sss.pgh.pa.us
Lists: pgsql-hackers

Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> writes:
> Currently we need to read pg_control to know the location (LSN) of the
> checkpoint record. This means that if pg_control is lost or corrupted, we
> have to give up on database recovery. I think we could start from the
> first WAL segment and read through the entire WAL to find the
> latest valid checkpoint record. This may take a considerable amount of
> time, but that is still better than giving up on recovery IMO. Is there
> any reason we cannot do this?

Is it worth worrying about? I don't recall that we've ever heard of a
loss-of-pg_control failure in the field. Certainly it *could* happen,
but I can gin up plenty of implausible scenarios where scanning pg_xlog
for a checkpoint would give the wrong answer as well. (Our habit of
recycling xlog segments by renaming them makes us vulnerable to
confusion over filenames, for example.) Since pg_control is
deliberately kept to less than one disk block and is written only once
per checkpoint, you'd have to be really unlucky to lose it anyway.
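
To make the scanning idea and its failure mode concrete, here is a minimal
sketch in Python. It uses a made-up record layout (CRC, LSN, record type,
payload length, payload), not PostgreSQL's actual WAL format, and the names
make_record and scan_for_latest_checkpoint are hypothetical. It only
illustrates the general approach of walking a log and remembering the last
checkpoint record whose checksum verifies.

```python
import struct
import zlib

# Toy layout, assumed for illustration only (NOT the real WAL format):
#   4-byte CRC32 | 8-byte LSN | 1-byte record type | 4-byte payload length | payload
HEADER = struct.Struct("<IQBI")
OTHER = 0
CHECKPOINT = 1


def make_record(lsn, rtype, payload=b""):
    """Build one toy record; the CRC covers everything after the CRC field."""
    body = struct.pack("<QBI", lsn, rtype, len(payload)) + payload
    return struct.pack("<I", zlib.crc32(body)) + body


def scan_for_latest_checkpoint(log):
    """Walk the log from the start and return the LSN of the last checkpoint
    record (by position) whose CRC verifies, or None if there is none.
    Scanning stops at the first truncated or corrupt record."""
    pos = 0
    latest = None
    while pos + HEADER.size <= len(log):
        crc, lsn, rtype, length = HEADER.unpack_from(log, pos)
        end = pos + HEADER.size + length
        if end > len(log):
            break  # record claims more bytes than the log holds
        if zlib.crc32(log[pos + 4:end]) != crc:
            break  # checksum mismatch: treat as end of usable log
        if rtype == CHECKPOINT:
            latest = lsn
        pos = end
    return latest


if __name__ == "__main__":
    log = (make_record(100, OTHER, b"a") + make_record(200, CHECKPOINT)
           + make_record(300, OTHER, b"bc") + make_record(400, CHECKPOINT))
    print(scan_for_latest_checkpoint(log))

    # Stale but intact record, standing in for leftovers in a recycled segment.
    print(scan_for_latest_checkpoint(log + make_record(50, CHECKPOINT)))
```

In this toy model the first scan finds the checkpoint at LSN 400. The second
scan returns 50, because an old record that is still intact follows the live
ones: a valid checksum says nothing about whether a record is current, which
is the kind of wrong answer described above.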

Also, you can rebuild pg_control from scratch using pg_resetxlog,
so loss of pg_control is not in itself worse than loss of the pg_xlog
directory.

My feeling is that pg_clog is by far the most fragile part of the
logging mechanism at the moment: two very critical bits per transaction
and essentially no error checking. If you want to improve reliability,
think about how to make clog more robust.
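
As a rough illustration of why two unchecked bits per transaction are
fragile, here is a small Python sketch of a packed status array. The status
codes, the bit ordering within a byte, and the helper names are assumptions
made for the example; it is not the pg_clog code itself.

```python
BITS_PER_XACT = 2      # two status bits per transaction, as described above
XACTS_PER_BYTE = 4     # so four transactions share each byte
MASK = 0b11

# Status codes assumed for illustration.
IN_PROGRESS, COMMITTED, ABORTED = 0, 1, 2


def set_status(clog, xid, status):
    """Store a 2-bit status for transaction xid in the packed array."""
    byte, slot = divmod(xid, XACTS_PER_BYTE)
    shift = slot * BITS_PER_XACT
    clog[byte] = (clog[byte] & ~(MASK << shift) & 0xFF) | (status << shift)


def get_status(clog, xid):
    """Read back the 2-bit status for transaction xid."""
    byte, slot = divmod(xid, XACTS_PER_BYTE)
    return (clog[byte] >> (slot * BITS_PER_XACT)) & MASK


def flip_bit(clog, bit_index):
    """Simulate a single-bit storage error."""
    clog[bit_index // 8] ^= 1 << (bit_index % 8)


if __name__ == "__main__":
    clog = bytearray(2)                    # room for 8 transactions
    set_status(clog, 5, IN_PROGRESS)
    flip_bit(clog, 5 * BITS_PER_XACT)      # damage the low status bit of xid 5
    print(get_status(clog, 5) == COMMITTED)
```

With this layout, one flipped bit turns an in-progress transaction into a
committed one, and nothing in the array can reveal that it happened. A
per-page checksum would be one way to detect such damage; that is a
suggestion for illustration, not something proposed in this message.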

regards, tom lane
