Re: Replaying 48 WAL files takes 80 minutes

From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Jeff *EXTERN*" <jeff(at)jefftrout(dot)com>, "Jeff Janes" <jeff(dot)janes(at)gmail(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Replaying 48 WAL files takes 80 minutes
Date: 2012-10-30 08:50:44
Message-ID: D960CB61B694CF459DCFB4B0128514C2089A60EC@exadv11.host.magwien.gv.at
Lists: pgsql-performance

>> On Mon, Oct 29, 2012 at 6:05 AM, Albe Laurenz
>> <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:
>>> I am configuring streaming replication with hot standby
>>> with PostgreSQL 9.1.3 on RHEL 6 (kernel 2.6.32-220.el6.x86_64).
>>> PostgreSQL was compiled from source.
>>>
>>> It works fine, except that starting the standby took forever:
>>> it took the system more than 80 minutes to replay 48 WAL files
>>> and connect to the primary.
>>>
>>> Can anybody think of an explanation why it takes that long?

Jeff Janes wrote:
>> Could the slow log files be replaying into randomly scattered pages
>> which are not yet in RAM?
>>
>> Do you have sar or vmstat reports?

The sar reports from the time in question tell me that I read
about 350 MB/s and wrote less than 0.2 MB/s. The disks were
fairly busy (around 90%).

Jeff Trout wrote:
> If you do not have good random I/O performance, log replay is nearly
> unbearable.
>
> Also, what I/O scheduler are you using? If it is cfq, change that to
> deadline or noop.
> That can make a huge difference.

We use the noop scheduler.
As I said, an identical system performed well in load tests.
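
For anyone who wants to verify the scheduler on their own machine, here
is a minimal sketch. It assumes the usual Linux sysfs layout, where
/sys/block/<device>/queue/scheduler lists the available schedulers and
marks the active one with square brackets. The function names and the
device name are made up for illustration.

```python
import re
from pathlib import Path


def parse_active_scheduler(text):
    """Return the scheduler name enclosed in [brackets], or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def active_scheduler(device):
    """Read the active I/O scheduler for a block device such as 'sda'.

    Returns None if the sysfs file does not exist (for example on a
    non-Linux system or for an unknown device name).
    """
    path = Path("/sys/block") / device / "queue" / "scheduler"
    if not path.exists():
        return None
    return parse_active_scheduler(path.read_text())


# Example with a sample line in the sysfs format:
print(parse_active_scheduler("noop [deadline] cfq"))
```

Calling active_scheduler("sda") does the same for a real device; "sda"
is only a placeholder and has to be replaced with the device that holds
the data directory.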

The sar reports lend credence to Jeff Janes' theory.
But why does WAL replay read so much more than it writes?
I thought that pretty much every block read during WAL
replay would also get dirtied and hence written out.

I wonder why the performance is good in the first few seconds.
Why should exactly the pages that I need in the beginning
happen to be in cache?

And finally: are the numbers I observe (48 files replayed in 80
minutes) OK, or is this as terribly slow as it seems to me?
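
To put those numbers into perspective, here is a rough back-of-the-envelope
sketch. It assumes the default WAL segment size of 16 MB; since the
server was compiled from source, a different size could have been
configured, in which case the constant must be adjusted. The function
name is made up for illustration.

```python
WAL_SEGMENT_MB = 16  # assumption: default WAL segment size, not confirmed in this thread


def replay_stats(segments, minutes, segment_mb=WAL_SEGMENT_MB):
    """Return total WAL volume and average replay speed for a recovery run."""
    seconds = minutes * 60
    total_mb = segments * segment_mb
    return {
        "total_wal_mb": total_mb,
        "seconds_per_segment": seconds / segments,
        "wal_mb_per_second": total_mb / seconds,
    }


stats = replay_stats(segments=48, minutes=80)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Under that assumption, 48 segments are 768 MB of WAL, replayed at
about 100 seconds per segment or roughly 0.16 MB of WAL per second.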

Yours,
Laurenz Albe
