Re: Avoiding unnecessary reads in recovery

From: Jim Nasby <decibel(at)decibel(dot)org>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: pgsql-hackers Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Avoiding unnecessary reads in recovery
Date: 2007-04-25 17:13:57
Message-ID: 6B1B90A7-5BF5-42C4-8238-7F774701BB6C@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Apr 25, 2007, at 2:48 PM, Heikki Linnakangas wrote:
> In recovery, with full_pages_writes=on, we read in each page only
> to overwrite the contents with a full page image. That's a waste of
> time, and can have a surprisingly large effect on recovery time.
>
> As a quick test on my laptop, I initialized a DBT-2 test with 5
> warehouses, and let it run for 2 minutes without think-times to
> generate some WAL. Then I did a "kill -9 postmaster", and took a
> copy of the data directory to use for testing recovery.
>
> With CVS HEAD, the recovery took ~ 2 minutes. With the attached
> patch, it took 5 seconds. (yes, I used the same not-yet-recovered
> data directory in both tests, and cleared the os cache with "echo 1
> > /proc/sys/vm/drop_caches").
>
> I was surprised how big a difference it makes, but when you think
> about it it's logical. Without the patch, it's doing roughly the
> same I/O as the test itself, reading in pages, modifying them, and
> writing them back. With the patch, all the reads are done
> sequentially from the WAL, and then written back in a batch at the
> end of the WAL replay which is a lot more efficient.
>
> It's interesting that (with the patch) full_page_writes can
> *shorten* your recovery time. I've always thought it to have a
> purely negative effect on performance.
>
> I'll leave it up to the jury if this tiny little change is
> appropriate after feature freeze...
>
> While working on this, this comment in ReadBuffer caught my eye:
>
>> /*
>> * During WAL recovery, the first access to any data page should
>> * overwrite the whole page from the WAL; so a clobbered page
>> * header is not reason to fail. Hence, when InRecovery we may
>> * always act as though zero_damaged_pages is ON.
>> */
>> if (zero_damaged_pages || InRecovery)
>> {
>
> But that assumption only holds if full_page_writes is enabled,
> right? I changed that in the attached patch as well, but if it
> isn't accepted that part of it should still be applied, I think.

So what happens if a backend is running with full_page_writes = off,
someone edits postgresql.conf to turns it on and forgets to reload/
restart, and then we crash? You'll come up in recovery mode thinking
that f_p_w was turned on, when in fact it wasn't.

ISTM that we need to somehow log what the status of full_page_writes
is, if it's going to affect how recovery works.
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2007-04-25 17:20:13 Re: Avoiding unnecessary reads in recovery
Previous Message Andrew Dunstan 2007-04-25 16:58:48 Re: ECPG failure on BF member Vaquita (Windows Vista)