Re: Slow PITR restore

From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Trout <threshar(at)threshar(dot)is-a-geek(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Slow PITR restore
Date: 2007-12-13 19:12:26
Message-ID: 20071213111226.10cd641a@commandprompt.com
Lists: pgsql-general pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, 12 Dec 2007 23:08:35 -0800
"Joshua D. Drake" <jd(at)commandprompt(dot)com> wrote:

> Tom Lane wrote:
> > "Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:
> >> Tom Lane wrote:
> >>> You sure about that? I tested CVS HEAD just now, by setting the
> >>> checkpoint_ parameters really high,
> >
> >> ... And:
> >
> >>> 2007-12-13 00:55:20 EST LOG: restored log file
> >>> "00000001000007E10000006B" from archive
> >
> > Hmm --- I was testing a straight crash-recovery scenario, not
> > restoring from archive. Are you sure your restore_command script
> > isn't responsible for a lot of the delay?
>
> Now that's an interesting thought, I will review in the morning when
> I have some more IQ points back.

As promised :)... I took a look at this today and I think I found a
couple of things. It appears that once the logs are archived, the
recovery command copies the archive file to a recovery location and
then restores the file.

If that is correct, it could explain some of the latency I am seeing
here. Even with the speed of these devices, it is still a 16 MB file,
and that could take 1-2 seconds to copy.
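
One rough way to check the 1-2 second estimate is to time the copy
directly. The following is only a sketch: it writes a dummy 16 MB file
rather than a real WAL segment, the function name and the temporary
directories are hypothetical stand-ins for the archive and recovery
locations, and the source file will probably still be in the page cache,
so the read side is likely understated.

```python
import os
import shutil
import tempfile
import time

# Size of one WAL segment as stated in the message (16 MB).
SEGMENT_BYTES = 16 * 1024 * 1024


def time_segment_copy(src_dir, dst_dir):
    """Copy a dummy 16 MB file from src_dir to dst_dir.

    Returns (elapsed_seconds, bytes_copied). The file name and both
    directories are placeholders, not anything PostgreSQL uses.
    """
    src = os.path.join(src_dir, "dummy_segment")
    dst = os.path.join(dst_dir, "dummy_segment")

    # Create the dummy segment outside the timed region.
    with open(src, "wb") as f:
        f.write(os.urandom(SEGMENT_BYTES))

    start = time.perf_counter()
    shutil.copyfile(src, dst)
    # Flush the copy to disk so the write cost is included in the timing.
    with open(dst, "rb+") as f:
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start

    return elapsed, os.path.getsize(dst)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        secs, size = time_segment_copy(a, b)
        print(f"copied {size} bytes in {secs:.3f} s")
```

To get numbers that mean anything, point src_dir and dst_dir at the
actual archive and recovery file systems instead of temporary
directories.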

There is also the execution of pg_standby each time as the recovery
command, which, although I haven't timed it, is going to add overhead.

Based on the logs I pasted we are showing a delay of 6, 14, 3, 13, 4
and then another 6 seconds.

When are fsyncs called during the recovery process?

With delays like these, even speeding the process up by 2 seconds per
log is going to be significant.
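
To put a number on "significant", here is the arithmetic on the six
delays quoted above, as a small sketch. It assumes those six samples are
representative and that the 2-second saving applies to every log, and
neither assumption has been measured.

```python
# Per-log delays in seconds, as quoted in the message.
delays = [6, 14, 3, 13, 4, 6]
# Hypothetical saving per log, also from the message.
saving_per_log = 2

total = sum(delays)                    # 46 seconds for six logs
mean = total / len(delays)             # about 7.67 seconds per log
saved = saving_per_log * len(delays)   # 12 seconds saved over six logs
fraction_saved = saved / total         # about 0.26

print(f"total {total} s, mean {mean:.2f} s per log")
print(f"saving {saved} s, or {fraction_saved:.0%} of the total")
```

On these samples, a 2-second saving per log would cut roughly a quarter
of the total restore time.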

Sincerely,

Joshua D. Drake

- --
The PostgreSQL Company: Since 1997, http://www.commandprompt.com/
Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD'

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHYYQaATb/zqfZUUQRAiiNAKCNDaO+MYDDLM/lUbL4D9Q9NIEyRQCgqhye
cJ2PAv9rEzAi/jDFPzzoFNw=
=xNMz
-----END PGP SIGNATURE-----
