WAL Optimisation - configuration and usage

From: Rob Fielding <rob(at)dsvr(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: WAL Optimisation - configuration and usage
Date: 2004-02-28 18:40:12
Message-ID: 4040E08C.1040408@dsvr.net
Lists: pgsql-hackers pgsql-hackers-pitr pgsql-performance

Hi,

There's a lot here, so skip to the middle, to my WAL settings, if you like.

I'm currently investigating the performance of a large database which
stores email designated as spam, for the perusal of customers wishing
to check it. This involves a number of subprocesses - several delivery
daemons, an expiry daemon and a UI which performs large selects. A
considerable amount of UPDATE, SELECT and DELETE activity is performed
continually.

Starting with a stock pg config, I've well understood the importance of
increased shared memory, effective_cache_size and a low random_page_cost,
as detailed in
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html. After some
system analysis with vmstat and sar we've been able to determine that the
main problem is that we're IO bound, and IMO this is due to lots of
updates causing heavy drive contention - the array is a RAID0 mirror and
the dataset was originally 79GB. A lot of spam is being sent in from our
mail scanners, and coupled with the UI this is creating an increasingly
lagging system.

Typically all our db servers have these sorts of enhancements - 1GB RAM,
SMP boxen with SCSI 160 disks:
effective_cache_size = 95694
random_page_cost = 0.5
sort_mem=65536
max_connections = 128
shared_buffers = 15732
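
For reference, assuming the standard 8KB block size, those two cache
settings work out at roughly:

shared_buffers        15732 x 8KB  = ~123MB
effective_cache_size  95694 x 8KB  = ~747MB

so between them they account for most of the 1GB of RAM, which is the
intention.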

My focus today has been on WAL - I've not looked at WAL before. I
increased the settings as follows:

wal_buffers = 64 # need to determine WAL usage
wal_files = 64 # range 0-64
wal_sync_method = fsync # the default varies across platforms:
wal_debug = 0 # range 0-16

# hopefully this should mean fewer LogFlushes per LogInsert - uses more
WAL though.
commit_delay = 10000 # range 0-100000
commit_siblings = 2 # range 1-1000
checkpoint_segments = 16 # in logfile segments (16MB each), min 1
checkpoint_timeout = 300 # in seconds, range 30-3600
fsync = true

Great improvements have been seen. A vacuumdb -f -a -z went from
processing 1 table in 10 minutes to 10 tables in 1 minute. :) I had
actually stopped the original run after 80 tables (48 hours runtime)
because the projected end time would have been next week. Once I
restarted the postmaster with the above WAL settings, vacuumdb -f -a -z
completed all 650 tables by the following day.
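
(For a rough per-table comparison, timing an individual vacuum before and
after the restart is enough - the database and table names below are only
illustrative:

$ time psql -d spamdb -c "VACUUM FULL VERBOSE ANALYZE customer_mail"

run once before and once after the config change and postmaster restart.)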

My thinking is therefore to reduce disk contention as far as possible
within the current hardware limitations. I'm aiming to keep the
committing backend satisfied that other backends are about to commit -
hence keeping commit_siblings low at 2 - and to create a sufficient
delay before each commit's flush so that many commits can be written in
a single sync. From the above config, I believe I've gone some way to
achieving this, and the performance I'm now seeing suggests as much.
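
(As I understand it, commit_delay is in microseconds, so 10000 is a 10ms
pause before the WAL flush whenever at least commit_siblings other active
backends exist, in the hope that their commits land in the same sync. If
these are settable per session on 7.2 - I believe they are settable at
runtime on later releases - then experimenting shouldn't need a restart,
e.g.:

$ psql -d spamdb
=> SHOW commit_delay;
=> SET commit_delay = 50000;  -- larger delay, this session only

with spamdb standing in for the real database name.)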

But I think we can get more out of this, as the above settings were
picked from thin air. My concern is being able to determine actual WAL
file usage, and whether the system is caught out at the other extreme -
that we're not committing fast enough. I've read that there shouldn't be
more than 2*checkpoint_segments+1 WAL files, yet my pg_xlog directory
contains 74 files, which suggests I'm using more logfiles than I should.
Also, I'm not sure what wal_buffers really should be set to.
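
If I read the 2*checkpoint_segments+1 rule correctly, that would be
2*16+1 = 33 segments of 16MB, about 528MB, whereas 74 files is nearer
1.2GB. Short of turning wal_debug up, the simplest check I can think of
is to keep an eye on the directory itself (the path is just a typical
PGDATA location, not necessarily ours):

$ ls /var/lib/pgsql/data/pg_xlog | wc -l
$ du -sh /var/lib/pgsql/data/pg_xlog

run periodically under load, to see whether the file count keeps growing
or settles down.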

Can I get any feedback on this? How to look into pg's WAL usage is
really what I'm after. BTW this is an old install I'm afraid - 7.2.2 -
it's been impossible to upgrade until now because it's been too slow. I
have moved pg_xlog onto the root SCSI disk - it doesn't appear to have
made a huge difference, but it could be on the same cable.
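
In case it matters, the move was essentially the usual stop / move /
symlink sequence (paths here are illustrative, and PGDATA is assumed to
be set for the postgres user):

$ pg_ctl stop
$ mv /var/lib/pgsql/data/pg_xlog /pg_xlog
$ ln -s /pg_xlog /var/lib/pgsql/data/pg_xlog
$ pg_ctl start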

Additional information as a bit of background:
I can supply sar output if required. I'm currently running our expiry
daemon, which scans all mail for each domain (i.e. each table), and this
seems to take a few hours to run on a 26GB archive. It's a lot faster
than it ever was. Load gets to about 8 as backends are all busy doing
selects, updates and deletes. This process has already been run recently,
so it shouldn't be doing too much deleting. It still seems IO bound, and
I don't think I'm going to solve that without a better disk arrangement,
but this is essentially what I'm doing now - exhausting other
possibilities.
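
To back the IO-bound claim up with per-device numbers, extended iostat
figures would show whether the data array or the root disk (now carrying
pg_xlog) is saturating - assuming the sysstat build on this box supports
extended stats:

$ iostat -x 5

watching the utilisation and average wait columns for each device.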

$ sar -B -s 16:00:00

16:35:55 pgpgin/s pgpgout/s activepg inadtypg inaclnpg inatarpg
16:36:00 3601.60 754.40 143492 87791 10230 48302
16:36:05 5766.40 552.80 143947 88039 10170 48431
16:36:10 3663.20 715.20 144578 88354 9075 48401
16:36:15 3634.40 412.00 144335 88405 9427 48433
16:36:20 5578.40 447.20 143626 88545 9817 48397
16:36:25 4154.40 469.60 143640 88654 10388 48536
16:36:30 3504.00 635.20 143538 88763 9992 48458
16:36:35 3540.80 456.00 142515 88949 10444 48381
16:36:40 3334.40 1067.20 143268 89244 9832 48468

$ vmstat 5
procs                       memory    swap           io      system        cpu
 r  b  w   swpd  free  buff  cache   si  so    bi    bo    in    cs  us  sy  id
 0  7  1  29588 10592 15700 809060    1   0    97    75     0   103  13   9  79
 3  8  0  29588 11680 15736 807620    0   0  3313   438  1838  3559  19  13  68
 2 13  1  29588 12808 15404 800328    0   0  4470   445  1515  1752   7   7  86
 0  9  1  29588 10992 15728 806476    0   0  2933   781  1246  2686  14  10  76
 2  5  1  29588 11336 15956 807884    0   0  3354   662  1773  5211  27  17  57
 4  5  0  29696 13072 16020 813872    0  24  4282   306  2632  7862  45  25  31
 4  6  1  29696 10400 16116 815084    0   0  5086   314  2668  7893  47  26  27
 9  2  1  29696 13060 16308 814232   27   0  3927   748  2586  7836  48  29  23
 3  8  1  29696 10444 16232 812816    3   0  4015   433  2443  7180  47  28  25
 8  4  0  29696 10904 16432 812488    0   0  4537   500  2616  8418  46  30  24
 4  6  2  29696 11048 16320 810276    0   0  6076   569  1893  3919  20  14  66
 0  5  0  29696 10480 16600 813788    0   0  4595   435  2400  6215  33  21  46
 3  6  0  29696 10536 16376 812248    0   0  3802   504  2417  7921  43  25  32
 1  6  1  29696 11236 16500 809636    0   0  3691   357  2171  5199  24  15  61
 0 14  1  29696 10228 16036 801368    0   0  4038   561  1566  3288  16  12  72

Sorry it's so long, but I thought some background info would be better
than none. Thanks for reading,

--

Rob Fielding
Development
Designer Servers Ltd
