Re: Proposed WAL changes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Vadim Mikheev" <vmikheev(at)sectorbase(dot)com>
Cc: "PostgreSQL Development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposed WAL changes
Date: 2001-03-07 16:09:25
Message-ID: 9601.983981365@sss.pgh.pa.us
Lists: pgsql-hackers

"Vadim Mikheev" <vmikheev(at)sectorbase(dot)com> writes:
>> I have just sent to the pgsql-patches list a rather large set of
> Please send it to me directly - pgsql-patches' archive is dated by Feb -:(

Done under separate cover.

>> proposed diffs for the WAL code. These changes:
>>
>> * Store two past checkpoint locations, not just one, in pg_control.
>> On startup, we fall back to the older checkpoint if the newer one
>> is unreadable. Also, a physical copy of the newest checkpoint record

> And what to do if older one is unreadable too?
> (Isn't it like using 2 x CRC32 instead of CRC64 ? -:))

Then you lose --- but two checkpoints gives you twice the chance of
recovery (probably more, actually, since it's much more likely that
the previous checkpoint will have reached disk safely).
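A minimal sketch of that fallback, for illustration only. It assumes each
checkpoint record carries a trailing CRC32; the names make_record,
read_record and choose_checkpoint are hypothetical and are not taken from
the patch.

```python
import zlib
from typing import Optional, Tuple


def make_record(payload: bytes) -> bytes:
    """Append a CRC32 so a torn or corrupted record can be detected."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def read_record(raw: Optional[bytes]) -> Optional[bytes]:
    """Return the payload if the trailing CRC32 matches, else None."""
    if raw is None or len(raw) < 4:
        return None
    payload, crc = raw[:-4], int.from_bytes(raw[-4:], "big")
    return payload if zlib.crc32(payload) == crc else None


def choose_checkpoint(newest: Optional[bytes],
                      previous: Optional[bytes]) -> Tuple[str, bytes]:
    """Prefer the newest checkpoint; fall back to the previous one."""
    for label, raw in (("newest", newest), ("previous", previous)):
        payload = read_record(raw)
        if payload is not None:
            return label, payload
    # Both unreadable: this is the "then you lose" case.
    raise RuntimeError("no readable checkpoint record")
```

If the newest record fails its check, startup proceeds from the previous
one; only when both fail is there nothing to start from.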

> And what to do if pg_control was lost? (We already discussed that we
> should read all logs from newest to oldest ones to find checkpoint).

If you have valid WAL files and broken pg_control, then reading the WAL
files is a way to recover. If you have valid pg_control and broken WAL
files, you have a big problem, but using pg_control to generate a new
empty WAL will at least let you get at your heap files.
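The two cases can be laid out as a small decision function. This is only a
sketch: recovery_action is a hypothetical name, and the "both valid" and
"both broken" branches are assumptions added to make it complete.

```python
def recovery_action(pg_control_ok: bool, wal_ok: bool) -> str:
    """Pick a recovery path from which of pg_control and WAL are readable."""
    if pg_control_ok and wal_ok:
        # Assumption: nothing is damaged, so no special handling is needed.
        return "normal startup"
    if wal_ok:
        # Valid WAL, broken pg_control: find a checkpoint in the WAL itself.
        return "read WAL files to find a checkpoint"
    if pg_control_ok:
        # Valid pg_control, broken WAL: consistency is not guaranteed.
        return "generate new empty WAL from pg_control, then dump heap files"
    # Assumption: neither case above applies when both are broken.
    return "no recovery path described"
```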

> And why to keep old log files with older checkpoint?

Not much point in remembering the older checkpoint location if the
associated WAL file is removed...

> Mmmm, how is recovery possible if the log was lost? All that could be done
> with DB in the event of corrupted/lost log is dumping data from tables
> *asis*, without any guarantee about consistency.

Exactly. That is still better than not being able to dump the data at
all.

>> * Change XID allocation to work more like OID allocation, so that we
>> can flush XID alloc info to the log before there is any chance an XID
>> will appear in heap files.

> I didn't read your postings about this yet.

See later discussion --- Andreas convinced me that flushing NEXTXID
records to disk isn't really needed after all. (I didn't take the flush
out of my patch yet, but will do so.) I still want to leave the NEXTXID
records in there, though, because I think that XID and OID assignment
ought to work as nearly alike as possible.
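A rough sketch of the "log ahead, then hand out from memory" style of
allocation being described. The class name, the prefetch size and the list
standing in for the log are all assumptions for illustration; this is not
the code in the patch.

```python
from typing import List


class LoggedCounter:
    """Hand out ids, logging an upper bound before any id in a batch is used."""

    def __init__(self, start: int, prefetch: int, log: List[int]) -> None:
        self.next_id = start
        self.logged_limit = start  # ids below this are covered by a log record
        self.prefetch = prefetch   # hypothetical batch size
        self.log = log             # stand-in for NEXTXID/NEXTOID log records

    def allocate(self) -> int:
        if self.next_id >= self.logged_limit:
            # Record the new limit before handing out any id from the batch.
            self.logged_limit = self.next_id + self.prefetch
            self.log.append(self.logged_limit)
        value = self.next_id
        self.next_id += 1
        return value


def recover_next_id(log: List[int], default: int) -> int:
    """After a crash, resume from the last logged limit to avoid reuse."""
    return log[-1] if log else default
```

Restarting from the last logged limit may skip some unused ids, but it
never hands out one that was already assigned before the crash.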

>> Before committing this stuff, I intend to prepare a contrib utility that
>> can be used to reset pg_control and pg_xlog. This is mainly for
>> disaster recovery purposes, but as a side benefit it will allow people

> Once again, I would call this "disaster *dump* purposes" -:)
> After such operation DB shouldn't be used for anything but dump!

Fair enough. But we need it.

regards, tom lane
