Re: Avoiding unnecessary reads in recovery

From:	Jim Nasby <decibel(at)decibel(dot)org>
To:	Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc:	pgsql-hackers Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Avoiding unnecessary reads in recovery
Date:	2007-04-25 17:13:57
Message-ID:	6B1B90A7-5BF5-42C4-8238-7F774701BB6C@decibel.org
Lists:	pgsql-hackers

On Apr 25, 2007, at 2:48 PM, Heikki Linnakangas wrote:
> In recovery, with full_page_writes=on, we read in each page only
> to overwrite the contents with a full page image. That's a waste of
> time, and can have a surprisingly large effect on recovery time.
>
> As a quick test on my laptop, I initialized a DBT-2 test with 5
> warehouses, and let it run for 2 minutes without think-times to
> generate some WAL. Then I did a "kill -9 postmaster", and took a
> copy of the data directory to use for testing recovery.
>
> With CVS HEAD, the recovery took ~ 2 minutes. With the attached
> patch, it took 5 seconds. (Yes, I used the same not-yet-recovered
> data directory in both tests, and cleared the OS cache with
> "echo 1 > /proc/sys/vm/drop_caches".)
>
> I was surprised how big a difference it makes, but when you think
> about it it's logical. Without the patch, it's doing roughly the
> same I/O as the test itself, reading in pages, modifying them, and
> writing them back. With the patch, all the reads are done
> sequentially from the WAL, and then written back in a batch at the
> end of the WAL replay which is a lot more efficient.
>
> It's interesting that (with the patch) full_page_writes can
> *shorten* your recovery time. I've always thought it to have a
> purely negative effect on performance.
>
> I'll leave it up to the jury if this tiny little change is
> appropriate after feature freeze...
>
> While working on this, this comment in ReadBuffer caught my eye:
>
>> /*
>> * During WAL recovery, the first access to any data page should
>> * overwrite the whole page from the WAL; so a clobbered page
>> * header is not reason to fail. Hence, when InRecovery we may
>> * always act as though zero_damaged_pages is ON.
>> */
>> if (zero_damaged_pages || InRecovery)
>> {
>
> But that assumption only holds if full_page_writes is enabled,
> right? I changed that in the attached patch as well, but if it
> isn't accepted that part of it should still be applied, I think.
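
To make the quoted idea concrete, here is a minimal sketch of redo that skips the disk read when a record carries a full page image. It is a toy model, not the patch or the real buffer manager: PageStore, WalRecord, read_page and redo_record are all hypothetical names invented for illustration, and the "disk" is just an in-memory array with a read counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 8192
#define NPAGES 4

/* Hypothetical stand-in for the data files, with a counter of disk reads. */
typedef struct {
    unsigned char disk[NPAGES][PAGE_SIZE];
    int reads;
} PageStore;

/* Hypothetical, simplified WAL record: a full page image or a one-byte change. */
typedef struct {
    int page_no;
    bool has_full_page_image;
    const unsigned char *image; /* PAGE_SIZE bytes when has_full_page_image */
    size_t offset;              /* used only for the one-byte change */
    unsigned char value;
} WalRecord;

/* Simulated disk read: copy the stored page into buf and count the read. */
static void read_page(PageStore *s, int page_no, unsigned char *buf)
{
    memcpy(buf, s->disk[page_no], PAGE_SIZE);
    s->reads++;
}

/*
 * Replay one record into buf. With skip_read_for_fpi set, a record that
 * carries a full page image never touches the disk, because the old page
 * contents would be overwritten in full anyway.
 */
void redo_record(PageStore *s, const WalRecord *rec, unsigned char *buf,
                 bool skip_read_for_fpi)
{
    assert(rec->page_no >= 0 && rec->page_no < NPAGES);

    if (rec->has_full_page_image) {
        if (!skip_read_for_fpi)
            read_page(s, rec->page_no, buf); /* wasted work: overwritten below */
        memcpy(buf, rec->image, PAGE_SIZE);
    } else {
        /* No image: the existing page contents are needed. */
        read_page(s, rec->page_no, buf);
        buf[rec->offset] = rec->value;
    }
}
```

In this model the read counter stays at zero for image-carrying records only when skip_read_for_fpi is set, which is the saving described above. Records without an image still need the old page, which is why the "first access overwrites the whole page" assumption matters.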

So what happens if a backend is running with full_page_writes = off,
someone edits postgresql.conf to turn it on but forgets to
reload/restart, and then we crash? You'll come up in recovery mode thinking
that f_p_w was turned on, when in fact it wasn't.

ISTM that we need to somehow log what the status of full_page_writes
is, if it's going to affect how recovery works.
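
One way to picture what I mean, as a rough sketch only: store the setting at checkpoint time and write a small record whenever it changes, so that recovery derives the value from the log instead of from postgresql.conf. Nothing below is existing code; ControlData, LogRecord, REC_FPW_CHANGE and fpw_in_effect_at are hypothetical names made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record types: ordinary data changes and a setting change. */
typedef enum { REC_DATA, REC_FPW_CHANGE } RecType;

typedef struct {
    RecType type;
    bool fpw_on; /* meaningful only for REC_FPW_CHANGE */
} LogRecord;

/* Hypothetical control data: the setting as of the last checkpoint. */
typedef struct {
    bool fpw_at_checkpoint;
} ControlData;

/*
 * Return the full_page_writes value that was in effect just before
 * recs[idx] was written, derived only from logged state. The current
 * contents of the config file are deliberately never consulted.
 */
bool fpw_in_effect_at(const ControlData *ctl, const LogRecord *recs,
                      size_t idx)
{
    bool fpw = ctl->fpw_at_checkpoint;

    for (size_t i = 0; i < idx; i++) {
        if (recs[i].type == REC_FPW_CHANGE)
            fpw = recs[i].fpw_on;
    }
    return fpw;
}
```

Recovery would then skip the read (or zero a damaged page) only for records where this returns true, so an edited-but-never-reloaded config file cannot mislead it.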
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
