redolog - for discussion

From: jwieck(at)debis(dot)com (Jan Wieck)
To: pgsql-hackers(at)postgreSQL(dot)org (PostgreSQL HACKERS)
Subject: redolog - for discussion
Date: 1998-11-27 21:17:46
Message-ID: m0zjVGt-000EBjC@orion.SAPserv.Hamburg.dsh.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

here are some details I would like to discuss before beeing
too deep in the implementation that it's hard to change them
later.

Point 1 is an extension of the pg_database relation that is
required to see the actual redolog state of a database. New
fields are:

lastbackup datetime
redomode int4
redoseq1 int4

lastbackup is the time, the last successful full backup was
taken. More precise, the time when pg_dump switched the
backend into online backup mode.

redomode is defined as 0=none, 1=async-logging, 2=sync-
logging, 4=restore, 5=recover, 6=error.

redoseq1 is the sequence number of the redolog file began
when pg_dump switched to online backup mode (this command
implies a logfile switch).

Point 2 is the extension of the querylanguage. All the
statements are restricted to superusers or the owner of the
database. The SQL statements to control the whole process
are:

ALTER DATABASE REDOMODE {NONE | ASYNCHRONOUS | SYNCHRONOUS};

Turns logging for the database on or off. Database must
be in normal operation mode for it (not restore or
recover mode).

ALTER DATABASE BEGIN BACKUP;

Issued by pg_dump before doing anything else.

The command stops ALL other activity in the database, so
pg_dump has time to pull out at least the information
about sequences (actually it does this while getting
tables, might require some changes there so the database
get's back accessible soon).

ALTER DATABASE ONLINE BACKUP;

Issued by pg_dump when it finished the things that
require total exclusive database access.

At this time, a logfile switch is done (only if the
actual database is really logged) and the sequence number
of the new logfile plus the current datetime remembered.
The behaviour of pg_dump's backend changes. It will see a
snapshot of this time (implemented in tqual code) in any
subsequent command and it is totally unable to do
anything that would update the database.

Until the final END BACKUP is given, no VACUUM or DROP
TABLE etc. commands can be run. If they are issued, the
command will be delayed until pg_dump finished.

ALTER DATABASE END BACKUP;

This turns back the special behaviour of pg_dump's
backend. Additionally the remembered time and redolog
sequence are stored in pg_database. pg_dump can read
them out for the final statement in the dump output (see
below).

ALTER DATABASE BEGIN RESTORE;

This command checks that the actual database is just
created and not one single command has been executed
before. It is the first command in pg_dump's output if
the database beeing dumped is a logged one.

It switches the database into restore mode. In this mode,
the first command on a new database connection must be
the special command

RECOVER DATABASE AS USER 'uname'

or an

ALTER DATABASE END RESTORE ...;

When doing the ACL stuff, pg_dump must output a reconnect
(\c) to the database without the additional username and
then issue the special command.

ALTER DATABASE END RESTORE [RECOVERY FROM redolog_seq];

This ends the restore mode. The additional RECOVERY FROM
is put into by pg_dump for logged databases only. It
reads out this information after END BACKUP. If not
given, the database is switched into normal operation
mode without logging. But if given, the sequence number
is stored in pg_database and the database is put into
recover mode. In that mode, only RECOVER commands can be
issued.

RECOVER DATABASE {ALL | UNTIL 'datetime' | RESET};

The database must be in recover mode. If RESET is given,
the mode is switched to ASYNC logging, The lastbackup
field is set to 'epoch' and redoseq1 set to 0. It resets
the database to the state at the backup snapshot time.

For the others, the backend starts the recovery program
which reads the redolog files, establishes database
connections as required and reruns all the commands in
them. If a required logfile isn't found, it tells the
backend and waits for the reply. The backend tells the
user what happened on error (redolog file with seq n
required but not found ...). So the user can put back the
required redolog files and let recover resume (actually
the user pgsql or root must put them back :-).

If the recovery is interrupted (controlling backend
terminates), the database is set into error mode and only
a RECOVER DATABASE RESET will help.

If the recovery finally succeeds, the same as for RESET
happens. The database is online in async logmode.

Since the "destroydb" is also remembered in the redolog,
recovery will stop at least if it hit's that for the
database actually recoverd. This is to prevent faulty
recovery which could occure if someone destroy's one
database, creates a new one with the same name but
different contents that is logged, destroy's it again and
then want's to restore and recover the first.

RECOVER DATABASE CONTINUE;

After beeing told to restore some more redolog files,
this command let's the recovery resume.

RECOVER DATABASE AS USER 'uname';

A special command used in restore and recover mode only.
This is restricted to superusers with usecatupd right
(not db owner) and modifies the current username in the
backend. It's ugly, but the problem is that ordinary
users should not be able to use the database while it is
in restore or recover mode. So the connection cannot be
established like with '\c - username'.

For the restore and recover this means, that a real
superuser with unrestricted access is needed to restore a
database that was dumped with ACL info. But otherwise
one with createdb but not superuser rights could put a
CREATE USER into a file, create a new database and
"restore" that as user pgsql. I'm sure we don't want
this.

###

Whow - hopefully I didn't forgot anything.

All that might look very complicated, but the only commands
someone would really use manually will be

ALTER DATABASE REDOMODE ...;

and

RECOVER DATABASE ...;

Anything else is used by pg_dump and the recovery program.

What I'm trying to implement with it is a behaviour that
makes it possible to backup, restore and recover a database
on a production system without running closed shop. Doing so
for one database will not affect the others in the instance,
since no modification of the hba conf or anything else will
be required. Only the database actually beeing restored is
closed for normal usage.

Hmmm - just have the next idea right now. One more field in
pg_database could tell that the db is shut down or restricted
Someone could disable a single database or restrict usage to
superusers/dbowner for some time, make database readonly etc.

Anyway - does anybody see problems with the above? Do we need
more functionality? Oh yeah - another utility that follows
log and replicates databases onto other systems on the fly.
But let me get this all running first please :-).

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#======================================== jwieck(at)debis(dot)com (Jan Wieck) #

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Vadim Mikheev 1998-11-27 22:51:26 HeapTuple changed
Previous Message Tom Ivar Helbekkmo 1998-11-27 19:20:16 Re: [HACKERS] Re: Mysql 321 - Mysql 322 - msql