Re: will PITR in 8.0 be usable for "hot spare"/"log

From: Eric Kerin <eric(at)bootseg(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: will PITR in 8.0 be usable for "hot spare"/"log
Date: 2004-08-14 22:50:44
Message-ID: 1092523844.8485.36.camel@auh5-0478
Lists: pgsql-hackers

On Sat, 2004-08-14 at 01:11, Tom Lane wrote:
> Eric Kerin <eric(at)bootseg(dot)com> writes:
> > The issues I've seen are:
> > 1. Knowing when the master has finished the file transfer to
> > the backup.
>
> The "standard" solution to this is you write to a temporary file name
> (generated off your process PID, or some other convenient reasonably-
> unique random name) and rename() into place only after you've finished
> the transfer.
Yup, much easier this way. Done.
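
For illustration, the write-then-rename() approach Tom describes can be
sketched as below. This is a minimal sketch, not the code in log_ship.c:
the function name copy_then_rename, the ".tmp.<pid>" naming scheme, and
the buffer sizes are assumptions made for the example. It relies on the
temporary file living in the same directory (and so the same filesystem)
as the destination, so that rename() replaces the name atomically.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all len bytes of buf to fd; returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    size_t off = 0;

    while (off < len) {
        ssize_t w = write(fd, buf + off, len - off);
        if (w < 0)
            return -1;
        off += (size_t) w;
    }
    return 0;
}

/*
 * Hypothetical helper: copy src to dst so that dst never exists in a
 * half-written state. Data goes to "<dst>.tmp.<pid>" first and is
 * renamed into place only after the copy has fully succeeded.
 * Returns 0 on success, -1 on failure with errno set.
 */
int copy_then_rename(const char *src, const char *dst)
{
    char tmp[1024];
    char buf[8192];
    int in, out, len, saved, ok = 1;

    assert(src != NULL && dst != NULL);

    /* Temporary name next to dst, made unique by our PID. */
    len = snprintf(tmp, sizeof(tmp), "%s.tmp.%ld", dst, (long) getpid());
    if (len < 0 || (size_t) len >= sizeof(tmp)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    out = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0) {
        saved = errno;
        close(in);
        errno = saved;
        return -1;
    }

    for (;;) {
        ssize_t n = read(in, buf, sizeof(buf));
        if (n < 0) { ok = 0; break; }
        if (n == 0) break;                      /* end of file */
        if (write_all(out, buf, (size_t) n) != 0) { ok = 0; break; }
    }

    /* Flush to disk before the file becomes visible under its final name. */
    if (ok && fsync(out) != 0)
        ok = 0;
    saved = errno;
    close(in);
    if (close(out) != 0 && ok) { ok = 0; saved = errno; }

    /* Only a complete file is ever renamed into place. */
    if (ok && rename(tmp, dst) != 0) { ok = 0; saved = errno; }

    if (!ok) {
        unlink(tmp);                            /* drop the partial copy */
        errno = saved;
        return -1;
    }
    return 0;
}
```

With this shape, the reader on the backup side never has to guess
whether a file is complete: a file under its final name is complete by
construction.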

> > 2. Handling the meta-files, (.history, .backup) (eg: not sleeping if
> > they don't exist)
>
> Yeah, this is an area that needs more thought. At the moment I believe
> both of these will only be asked for during the initial microseconds of
> slave-postmaster start. If they are not there I don't think you need to
> wait for them. It's only plain ol' WAL segments that you want to wait
> for. (Anyone see a hole in that analysis?)
>
Seems to be working fine this way; I'm now just returning ENOENT if they
don't exist.
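
A rough sketch of that policy, for readers following along: fail
immediately with ENOENT for a missing .history or .backup file, but keep
polling for a plain WAL segment. The names is_meta_file and
wait_for_file, and the poll parameters, are made up for this example and
are not taken from log_ship.c.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if name ends with suffix, 0 otherwise. */
static int has_suffix(const char *name, const char *suffix)
{
    size_t nlen = strlen(name);
    size_t slen = strlen(suffix);

    return nlen >= slen && strcmp(name + nlen - slen, suffix) == 0;
}

/* Returns 1 for the meta-files (.history, .backup), 0 for anything else. */
int is_meta_file(const char *name)
{
    assert(name != NULL);
    return has_suffix(name, ".history") || has_suffix(name, ".backup");
}

/*
 * Hypothetical wait policy. Returns 0 once path is readable.
 * A missing meta-file fails at once with errno = ENOENT.
 * A missing WAL segment is polled every poll_seconds, up to max_polls
 * times (a negative max_polls means poll forever).
 */
int wait_for_file(const char *path, unsigned poll_seconds, int max_polls)
{
    int polls = 0;

    for (;;) {
        if (access(path, R_OK) == 0)
            return 0;
        if (errno != ENOENT)
            return -1;              /* some other error: report it */
        if (is_meta_file(path)) {
            errno = ENOENT;         /* do not sleep for meta-files */
            return -1;
        }
        if (max_polls >= 0 && polls >= max_polls) {
            errno = ENOENT;
            return -1;
        }
        sleep(poll_seconds);
        polls++;
    }
}
```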

> > 3. Keeping the backup from coming online before the replay has fully
> > finished in the event of a failure to copy a file, or other strange
> > errors (out of memory, etc).
>
> Right, also an area that needs thought. Some other people opined that
> they want the switchover to occur only on manual command. I'd go with
> that too if you have anything close to 24x7 availability of admins.
> If you *must* have automatic switchover, what's the safest criterion?
> Dunno, but let's think ...

I'm not even really talking about automatic startup on failover. Right
now, if the recovery_command returns anything but 0, the database will
finish recovery and come online. You would then have to rebuild your
backup system from a copy of the master unnecessarily. Sounds kinda
messy to me, especially if it's a false trigger (temporary I/O error,
out of memory).
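
One way to reduce false triggers from inside the shipping program itself
is to retry a failed fetch a few times before giving up and returning
nonzero. The sketch below is only an illustration of that idea under
stated assumptions: retry_fetch, the fetch_fn callback type, and the
fail_n_times demo callback are hypothetical names, and a retry loop does
not remove the underlying problem that any nonzero exit ends recovery.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical callback type: returns 0 on success, nonzero on failure. */
typedef int (*fetch_fn)(void *arg);

/*
 * Call fetch(arg) up to max_attempts times, sleeping delay_seconds
 * between attempts. Returns 0 as soon as one attempt succeeds,
 * otherwise the result of the last failed attempt.
 */
int retry_fetch(fetch_fn fetch, void *arg, int max_attempts,
                unsigned delay_seconds)
{
    int attempt;
    int rc = -1;

    assert(fetch != NULL && max_attempts > 0);

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        rc = fetch(arg);
        if (rc == 0)
            return 0;
        if (attempt < max_attempts)
            sleep(delay_seconds);   /* give a transient error time to clear */
    }
    return rc;
}

/*
 * Demo callback for the example only: arg points to an int counter,
 * and the call fails until that counter has been decremented to zero.
 */
int fail_n_times(void *arg)
{
    int *remaining = arg;

    if (*remaining > 0) {
        (*remaining)--;
        return 1;
    }
    return 0;
}
```

The program would then exit nonzero only after the retries are
exhausted, so a single transient error no longer brings the backup
online by accident.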

What I think might be a better long-term approach (but probably more of
an 8.1 thing): have the database go into a read-only/replay mode and
accept only read-only commands from users. A replay program opens a
connection to the backup system's postmaster and tells it to replay a
given file when it becomes available. Once you want the system to come
online, the DBA calls a different function that instructs the
system to come fully online and start accepting updates from users.

This could be quite complex, but it provides two things: proper log
shipping with status (without the false fail->db online possibility),
and read-only replicated backup system(s), which would also be good
for a reporting database.

Thoughts?

Anyway, here's a rewritten program for my implementation of log
shipping:
http://www.bootseg.com/log_ship.c
It operates mostly the same, but most of the stupid bugs are fixed. The
old one was renamed to http://www.bootseg.com/log_ship.c.ver1 if you
really want it.

Thanks,
Eric
