Re: General performance/load issue

From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Robert Treat" <rob(at)xzilla(dot)net>
Cc: "Tomas Vondra" <tv(at)fuzzy(dot)cz>, "Gaëtan Allart" <gaetan(at)nexylan(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: General performance/load issue
Date: 2011-11-24 16:02:39
Message-ID: fbb11c915ef5b16d10cf5d8a663e8f7e.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 24 Listopad 2011, 16:39, Robert Treat wrote:
> On Thu, Nov 24, 2011 at 9:09 AM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>> On 24 Listopad 2011, 14:51, Gaëtan Allart wrote:
>>> Postgresql.conf :
>>>
>>> max_connections = 50
>>> shared_buffers = 12G
>>> temp_buffers = 40MB
>>> work_mem = 128MB
>>> maintenance_work_mem = 256MB
>>> max_files_per_process = 8192
>>> checkpoint_segments = 256
>>> checkpoint_timeout = 30min
>>> checkpoint_completion_target = 0.9
>>
>> Fine. Let's see the options that look suspicious.
>>
>
> I think you missed some suspicious settings... I'd recommend setting
> shared buffers to 8gb, and I'd likely reduce checkpoint segements to
> 30 and set the checkpoint timeout back to 5 minutes. Everything about
> the way this server is configured (including those vm settings) is
> pushing it towards delaying the WAL/Buffer/Checkpoint as long as
> possible, which matches with the idea of good performance initial
> followed by a period of poor performance and heavy i/o.

Yes, checkpoints were my first thought too. OTOH the OP reported that most
of the I/O is caused by WAL writer - that's not exactly the part that does
the work during checkpoint. Plus the WAL may not be postponed, as it's
usually O_DIRECT and fsynced, right.

You're right that the writes are postponed, but I generally see that as a
good thing when combined with spread checkpoints. And even with those vm
settings (about 3.2GB for background writes), I wouldn't expect this
behaviour (because the page cache usually expires after 30 seconds). Say
you need 100% of the shared buffers is dirty and need to be written. You
have 27 minutes (30*0.9) to do that - that means about 8MB/s. With 30
seconds expire there might be about 240MB before the pdflush starts to
write the data to the SSD. And that can surely handle more than 50MB/s. So
why the long delay? The question is what else is going on there.

But all this is just guessing - I want to see the log_checkpoint message,
iostat results etc.

> On a side note, I'd guess your work_mem is probably too high. 50
> (connections) x 128 (mb work mem) x 2 (sorts per query) = 12GB RAM,
> which is 25% of total ram on the box. That doesn't necessarily mean
> game over, but it seem like it wouldn't be that hard to get thrashing
> being set up that way. YMMV.

This is one of the reasons why effective_cache_size should be lower than
32GB, probably ...

Tomas

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Gaëtan Allart 2011-11-24 16:12:21 Re: General performance/load issue
Previous Message Benjamin Henrion 2011-11-24 15:53:43 Re: Incremental backup with RSYNC or something?