Re: [GENERAL] Slow PITR restore

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Jeff Trout <threshar(at)threshar(dot)is-a-geek(dot)com>, Koichi Suzuki <suzuki(dot)koichi(at)oss(dot)ntt(dot)co(dot)jp>
Subject: Re: [GENERAL] Slow PITR restore
Date: 2007-12-14 00:09:24
Message-ID: 200712131609.26419.josh@agliodbs.com
Lists: pgsql-general pgsql-hackers

Tom,

> [ shrug... ] This is not consistent with my experience. I can't help
> suspecting misconfiguration; perhaps shared_buffers much smaller on the
> backup, for example.

You're only going to see it on SMP systems which have a high degree of CPU
utilization. That is, when you have 16 cores processing flat-out, then
the *single* core which will replay that log could certainly have trouble
keeping up. And this wouldn't be an issue which would show up testing on
a dual-core system.
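
As a rough illustration of the argument above, here is a minimal sketch of how a single replay process falls behind as the number of busy cores grows. This is a back-of-envelope model, not measured data: the function name, the rates, and the units are all hypothetical, and it assumes WAL generation scales linearly with the number of active cores while replay is limited to one core.

```python
def replay_backlog(writers, wal_rate_per_writer, replay_rate, seconds):
    """Return the unreplayed WAL (arbitrary units) after `seconds`.

    Hypothetical model: `writers` cores each generate WAL at
    `wal_rate_per_writer` units/sec, while one replay process
    consumes it at `replay_rate` units/sec.
    """
    generated = writers * wal_rate_per_writer * seconds
    # Replay can never consume more WAL than has been generated.
    replayed = min(generated, replay_rate * seconds)
    return generated - replayed


# Assumed numbers for illustration only: replay runs 4x as fast as one writer.
dual_core = replay_backlog(writers=2, wal_rate_per_writer=1.0,
                           replay_rate=4.0, seconds=60)
sixteen_core = replay_backlog(writers=16, wal_rate_per_writer=1.0,
                              replay_rate=4.0, seconds=60)
print(dual_core, sixteen_core)
```

Under these assumed rates the dual-core case never builds a backlog, which is why the problem would not appear in testing on a small machine, while the 16-core case falls steadily behind.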

I don't have extensive testing data on that myself (I relied on Koichi's
as well) but I do have another real-world case where our slow recovery
time is a serious problem: clustered filesystem failover configurations,
e.g. RHCFS, OpenHACluster, Veritas. For those configurations, when one
node fails PostgreSQL is started on a 2nd node against the same data ...
and goes through recovery. On very high-volume systems, the recovery can
be quite slow, up to 15 minutes, which is a long time for a web site to be
down.

I completely agree that we don't want to risk the reliability of recovery
in attempts to speed it up, though, so maybe this isn't something we can
do right now. But I don't agree that it's not an issue for users.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
