Re: Initial 9.2 pgbench write results

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Initial 9.2 pgbench write results
Date: 2012-02-18 19:35:53
Message-ID: CA+TgmoYyPszAHXTzMn_cjqk=qY7O0Cs5_QxfbZ+ETTKhsrOpqQ@mail.gmail.com

On Tue, Feb 14, 2012 at 3:25 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> On 02/14/2012 01:45 PM, Greg Smith wrote:
>>
>> scale=1000, db is 94% of RAM; clients=4
>> Version TPS
>> 9.0  535
>> 9.1  491 (-8.4% relative to 9.0)
>> 9.2  338 (-31.2% relative to 9.1)
>
> A second pass through this data noted that the maximum number of buffers
> cleaned by the background writer is <=2785 in 9.0/9.1, while it goes as high
> as 17345 in 9.2.  The background writer is so busy now it hits the
> max_clean limit around 147 times in the slower[1] of the 9.2 runs.  That's
> an average of once every 4 seconds, quite frequent, whereas max_clean
> rarely happens in the comparable 9.0/9.1 results.  This is starting to point
> my finger more toward this being an unintended consequence of the background
> writer/checkpointer split.

I guess the question that occurs to me is: why is it busier?

It may be that the changes we've made to reduce lock contention are
allowing foreground processes to get work done faster. When they get
work done faster, they dirty more buffers, and therefore the
background writer gets busier. Also, if the background writer is more
reliably cleaning pages even during checkpoints, that could have the
same effect. Backends write fewer of their own pages, therefore they
get more real work done, which of course means dirtying more pages.
But I'm just speculating here.
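
One rough way to test that speculation is to snapshot pg_stat_bgwriter
before and after a run and see who actually wrote the dirty buffers. The
sketch below assumes the counters as they exist in the 9.x view
(buffers_checkpoint, buffers_clean, buffers_backend, maxwritten_clean);
the helper name and the idea of passing snapshots as plain dicts are
illustrative assumptions, not something from the test setup discussed here.

```python
# Sketch only: compare two snapshots of pg_stat_bgwriter taken around a run.
# Column names follow the 9.x pg_stat_bgwriter view; bgwriter_delta() and the
# dict-based snapshot format are hypothetical conveniences.

def bgwriter_delta(before, after, elapsed_s):
    """Summarize buffer writes between two snapshots of the same counters."""
    d = {name: after[name] - before[name] for name in before}
    total = d["buffers_checkpoint"] + d["buffers_clean"] + d["buffers_backend"]
    hits = d["maxwritten_clean"]

    def share(count):
        # Fraction of all buffer writes done by one kind of process.
        return count / total if total else 0.0

    return {
        "checkpoint_share": share(d["buffers_checkpoint"]),
        "bgwriter_share": share(d["buffers_clean"]),
        "backend_share": share(d["buffers_backend"]),
        # Average seconds between times the bgwriter stopped a cleaning
        # round because it hit bgwriter_lru_maxpages; None if it never did.
        "seconds_per_maxwritten": elapsed_s / hits if hits else None,
    }
```

If the backend share drops between 9.1 and 9.2 while the bgwriter share and
the max_clean rate rise, that would be consistent with the theory above,
though it would not prove it.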

> Thinking out loud about solutions before the problem is even nailed down, I
> wonder if we should consider lowering bgwriter_lru_maxpages now in the
> default config?  In older versions, the page cleaning work had at most a 50%
> duty cycle; it was only running when checkpoints were not.

Is this really true? I see CheckpointWriteDelay calling BgBufferSync
in 9.1. Background writing would stop during the sync phase and
perhaps slow down a bit during checkpoint writing, but I don't think
it was stopped completely.

I'm curious what vmstat output looks like during your test. I've
found that's a good way to know whether the system is being limited by
I/O, CPU, or locks. It'd also be interesting to know what the %
utilization figures for the disks looked like.
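
As a minimal sketch of how vmstat output can be read for this purpose, the
snippet below parses captured output and labels each sample by its CPU
columns (us, sy, id, wa). The thresholds and the function names are
assumptions picked for illustration; a mostly idle system with little I/O
wait is only a hint of lock waits (or too few clients), not a diagnosis.

```python
# Sketch only: label vmstat samples as I/O-bound, CPU-bound, or neither.
# The thresholds (wa_limit, cpu_limit) are arbitrary illustrative defaults.

def parse_vmstat(text):
    """Return one dict per sample row, keyed by vmstat's column names."""
    rows, cols = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("procs"):
            continue                      # blank line or group banner
        if fields[0] == "r":              # column-name header (may repeat)
            cols = fields
        elif cols and len(fields) == len(cols):
            rows.append(dict(zip(cols, map(int, fields))))
    return rows

def classify(sample, wa_limit=20, cpu_limit=80):
    """Guess the limiting resource for one vmstat sample."""
    if sample["wa"] >= wa_limit:
        return "io"                       # CPUs mostly waiting on disk
    if sample["us"] + sample["sy"] >= cpu_limit:
        return "cpu"                      # CPUs mostly busy
    return "idle-or-locks"                # idle without I/O wait
```

Feeding it the saved output of something like "vmstat 1" taken during the
benchmark gives a per-second label that can be lined up against the
pgbench latency log.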

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
