Re: settings input for upgrade

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: settings input for upgrade
Date: 2011-08-21 19:20:09
Message-ID: 4E515A69.6050709@2ndQuadrant.com
Lists: pgsql-performance
On 08/18/2011 05:55 PM, Midge Brown wrote:
> DB1 is 10GB and consists of multiple tables that I've spread out so 
> that the 3 most used have their data and indexes on 6 separate RAID1 
> drives, the 3 next busiest have data & index on 3 drives, and the 
> remaining tables and indexes are on the RAID10 drive. The WAL for all 
> is on a separate RAID1 drive.
> DB2 is 25GB with data, index, and WAL all on separate RAID1 drives.
> DB3 is 15GB with data, index, and WAL on separate RAID1 drives.

Anytime you have a set of disks and a set of databases/tables to lay out 
onto them, there are two main options to consider:

-Put all of them into a single RAID10 array.  Performance will be high 
no matter what subset of the database is being used.  But if one 
particular part of a database is really busy, it can divert resources 
away from the rest.

-Break the database into fine-grained pieces and carefully lay out each 
of them on disk.  Performance of any individual chunk will be steady 
here.  But if only a subset of the data is being used, server resources 
will be idle.  All of the disks that don't hold data related to the 
active subset will sit unused.

Consider two configurations following these ideas:

1) 12 disks are placed into a large RAID10 array.  Peak transfer rate 
will be about 600MB/s on sequential scans.

2) 6 RAID1 arrays are created and the database is manually laid out onto 
those disks.  Peak transfer rate from any one section will be closer to 
100MB/s.
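
For reference, the manual layout in (2) is done on the PostgreSQL side 
with tablespaces; a rough sketch, where the mount points and object 
names are only placeholders for your own:

  -- one tablespace per dedicated RAID1 pair
  CREATE TABLESPACE busy_data  LOCATION '/raid1_a/pgdata';
  CREATE TABLESPACE busy_index LOCATION '/raid1_b/pgdata';

  -- pin the hottest objects onto them
  ALTER TABLE big_busy_table SET TABLESPACE busy_data;
  ALTER INDEX big_busy_table_pkey SET TABLESPACE busy_index;

Moving an existing table this way locks it and rewrites its files, so 
it's easier to assign tablespaces before the data is loaded.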

Each of these is optimizing for a different use scenario.  Here's the 
best case for each:

-One user is active, and they're hitting one of the database sections.  
In setup (1) they might get 600MB/s, the case where it shows the most 
benefit.  In setup (2), they'd only get 100MB/s.

-10 users are pounding one section of the database; 1 user is hitting a 
different section.  In setup (2), all 10 users will be fighting over 
access to one section of the disks, each getting (at best) 10MB/s out 
of that array's 100MB/s.  The nature of random I/O means that it will 
likely be much worse for them.  Meanwhile, the user hitting the other 
database section will still be merrily chugging away getting their 
100MB/s.  Had setup (1) been used, you'd have 11 users fighting over 
600MB/s, so at best around 55MB/s each.  And with the random mix, it 
could be much worse.

Which of these is better?  Well, (1) is guaranteed to use your hardware 
to its fullest capability.  There are some situations where contention 
over the shared array will make performance lower for some people than 
it would have been in an isolated environment split up more like (2).  
But the rest of the time, (2) will have taken a large number of disks 
and left them idle.  The second example shows this really well.  The 
mere fact that you have such a huge aggregate speed available means 
that the big array doesn't necessarily suffer that badly under a heavy 
load; it has 6X as much capacity to absorb it.  You really need a >6:1 
imbalance in access before the carefully laid out version pulls ahead.  
In every other case, the big array wins.

You can defend (2) as the better choice if you have really compelling, 
hard data proving that access to the various parts of the data is split 
quite evenly across the expected incoming workload.  It can also make 
sense if you have response time latency targets that require dividing 
resources evenly among the various types of users.  I don't know 
whether the data you've been collecting from your older version is good 
enough to know that for sure or not.

In every other case, you'd be better off just dumping the whole pile 
into a single, large array, and letting the array and operating system 
figure out how to schedule things best.  That's why this is the normal 
practice for building PostgreSQL systems.  The sole exception is that 
splitting out the pg_xlog filesystem can usually be justified even with 
a larger array.  Because its I/O is always sequential, mixing its 
writes in with the rest of the server doesn't work as well as giving it 
a dedicated pair of drives to write to, where it never has to stop and 
seek somewhere else.
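
For anyone who hasn't made that split before, the mechanics are just 
relocating pg_xlog onto the dedicated drives with the server stopped 
and symlinking it back into place; a sketch, with made-up paths:

  # stop the server before touching pg_xlog
  mv /var/lib/pgsql/data/pg_xlog /wal_raid1/pg_xlog
  ln -s /wal_raid1/pg_xlog /var/lib/pgsql/data/pg_xlog
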
> wal_buffers = 32MB

This might as well drop to 16MB.  And you've already gotten some 
warnings about work_mem.  Switching to a connection pooler would help 
with that, too.
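
In postgresql.conf terms, something along these lines is a more typical 
starting point; the work_mem figure is only a placeholder, since the 
right value depends on how many connections can be sorting at once:

  wal_buffers = 16MB   # little benefit above one 16MB WAL segment
  work_mem = 8MB       # allocated per sort/hash, per backend, so the
                       # worst case is roughly max_connections * work_mem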


> autovacuum_analyze_threshold = 250
>
> autovacuum_naptime = 10min
>
> autovacuum_vacuum_threshold = 250
>
> vacuum_cost_delay = 10ms
>

This strikes me as more customization than you should really be doing 
to autovacuum if you haven't been running on a recent version of 
PostgreSQL yet.  You shouldn't ever need to touch the thresholds, for 
example.  Those only matter on really small tables; once something gets 
big enough to really matter, the threshold part is tiny compared to the 
scale factor one.  And the defaults are picked partly so that cleanup 
of the system catalog tables happens frequently enough.  You're slowing 
that cleanup by moving the thresholds upward so much, and that's not a 
great idea.
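
To put some numbers on that: autovacuum triggers a cleanup of a table 
once its dead rows exceed threshold + scale_factor * reltuples, and the 
defaults are a threshold of 50 with a scale factor of 0.2:

  1,000,000 row table, default threshold:   50 + 0.2 * 1,000,000 = 200,050
  1,000,000 row table, threshold of 250:   250 + 0.2 * 1,000,000 = 200,250
  500 row catalog table, default:           50 + 0.2 * 500       =     150
  500 row catalog table, threshold of 250: 250 + 0.2 * 500       =     350

The big table barely notices the change, while the small catalog table 
now has to accumulate more than twice as many dead rows before it gets 
cleaned up.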

For similar reasons, you really shouldn't be touching autovacuum_naptime 
unless there's really good evidence it's necessary for your environment.

Giving regular vacuums executed at the command line a cost delay like 
this should be fine, though.  With a 10ms delay they'll chew through 
I/O about twice as fast as the autovacuum ones, but they won't run flat 
out the way an undelayed manual VACUUM normally does.
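
For reference, the two settings involved; the 2:1 ratio between them is 
where "twice as many resources" comes from:

  vacuum_cost_delay = 10ms             # manual VACUUM; default is 0, no throttle
  autovacuum_vacuum_cost_delay = 20ms  # the autovacuum default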

> deadlock_timeout = 3s

You probably don't want to increase this.  When you reach the point 
where you want to find slow lock issues by turning on log_lock_waits, 
you're just going to put it right back to the default again--or lower it.
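
Something closer to this is what you'd likely end up with once you're 
chasing lock waits:

  log_lock_waits = on      # log any lock wait that lasts longer than...
  deadlock_timeout = 1s    # ...this; the default, or even lower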

-- 
Greg Smith   2ndQuadrant US    greg(at)2ndQuadrant(dot)com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
