write barrier question

From: Samuel Gendler <sgendler(at)ideasculptor(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: write barrier question
Date: 2010-08-18 19:24:28
Message-ID: AANLkTim0X2oZ61dzvwG_+TWf94DLEr_0Q_hecq57EZ6+@mail.gmail.com
Lists: pgsql-performance

I'm just starting the process of trying to tune new hardware, which is
2x quad core xeon, 48GB RAM, 8x300GB SAS 15K drives in RAID 1+0,
2x72GB 15K SAS drives in RAID 1 for WAL and system. It is a PERC 6/i
card with BBU. Write-back cache is enabled. The system volume is
ext3. The large data partition is ext4.

Current config changes are as follows (though I've been experimenting
with a variety of settings):

default_statistics_target = 50 # pgtune wizard 2010-08-17
maintenance_work_mem = 1GB # pgtune wizard 2010-08-17
constraint_exclusion = on # pgtune wizard 2010-08-17
checkpoint_completion_target = 0.9 # pgtune wizard 2010-08-17
effective_cache_size = 36GB # sam
work_mem = 288MB # pgtune wizard 2010-08-17
wal_buffers = 8MB # pgtune wizard 2010-08-17
#checkpoint_segments = 16 # pgtune wizard 2010-08-17
checkpoint_segments = 30 # sam
shared_buffers = 11GB # pgtune wizard 2010-08-17
max_connections = 80 # pgtune wizard 2010-08-17
cpu_tuple_cost = 0.0030 # sam
cpu_index_tuple_cost = 0.0010 # sam
cpu_operator_cost = 0.0005 # sam
#random_page_cost = 2.0 # sam
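
As a rough sanity check on the memory-related settings above, the arithmetic can be sketched as follows. The numbers are copied from the config; the helper name and the `sorts_per_query` parameter are hypothetical, and the "one work_mem allocation per connection" case is a simplifying assumption, since a single query can use work_mem more than once (one allocation per sort or hash operation).

```python
# Rough memory budget for the settings listed above (all sizes in MB).
# Assumption: every connection is active and uses work_mem exactly
# `sorts_per_query` times; real queries may use more or fewer.
GB = 1024  # MB per GB

def worst_case_work_mem_mb(work_mem_mb, max_connections, sorts_per_query=1):
    """Upper bound on sort/hash memory if all connections are busy."""
    return work_mem_mb * max_connections * sorts_per_query

ram_mb = 48 * GB             # 48GB RAM
shared_buffers_mb = 11 * GB  # shared_buffers = 11GB
work_mem_mb = 288            # work_mem = 288MB
max_connections = 80         # max_connections = 80

sort_budget_mb = worst_case_work_mem_mb(work_mem_mb, max_connections)
total_mb = shared_buffers_mb + sort_budget_mb

print(f"work_mem x max_connections: {sort_budget_mb / GB:.1f} GB")
print(f"plus shared_buffers:        {total_mb / GB:.1f} GB")
print(f"share of RAM:               {total_mb / ram_mb:.0%}")
```

With one work_mem allocation per connection this comes to 22.5 GB of sort memory and 33.5 GB including shared_buffers, which leaves the remainder of the 48GB for the OS page cache.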

It will eventually be a mixed-use db, but the OLTP load is very light.
ETL for the warehouse side of things does no updates or deletes.
Just inserts and partition drops. I know that
default_statistics_target isn't right for a warehouse workload, but I
haven't gotten to the point of tuning with a production workload, yet,
so I'm leaving the pgtune default.

When running pgbench on a db which fits easily into RAM (10% of RAM =
-s 380), I see transaction rates a little under 5K. When I go to
90% of RAM (-s 3420), the transaction rate drops to around 1000 (across a
fairly wide range of concurrencies). At that point, I decided to
investigate the performance impact of write barriers. I tried building
and running the test_fsync utility from the source distribution but
really didn't understand how to interpret the results. So I just
tried the same test with write barriers on and write barriers off (on
both volumes).
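
The scale factors used above follow a simple proportion, which can be sketched as below. The constant 3800 is not a pgbench value; it is only what "10% of RAM = -s 380" implies for this 48GB machine, and the function name is hypothetical. The 50% row is an extra illustrative data point, not a test that was run.

```python
# Scale factor that corresponds to 100% of RAM on this machine,
# inferred from "10% of RAM = -s 380" (assumption, not a pgbench constant).
SCALE_AT_100_PERCENT = 3800

def scale_for_ram_percent(percent):
    """pgbench -s value for a database sized at `percent` of RAM."""
    return percent * SCALE_AT_100_PERCENT // 100

for percent in (10, 50, 90):
    scale = scale_for_ram_percent(percent)
    # -i initializes the pgbench tables, -s sets the scale factor.
    print(f"{percent:>3}% of RAM -> pgbench -i -s {scale}")
```

For 10% and 90% this reproduces the -s 380 and -s 3420 values used in the tests.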

With barriers off, I saw a transaction rate of about 1200. With
barriers on, it was closer to 1050. The test had a concurrency of 40
in both cases. From what I understand of the write barrier problem, a
misbehaving controller will flush its cache to disk on every
barrier, so if that were happening I would expect performance to drop a
heck of a lot more than 13%. I assume the relatively small performance reduction is just
contention on the write barriers between the 40 backends. I was
hoping someone could confirm this (I could test on and off with lower
concurrency, of course, but that will take hours to complete). It
occurs to me that the relatively small drop in performance may also be
the result of the relatively small db size. Our actual production db
is likely to be closer to 200% of RAM, but the most active data should
be a lot closer to 90% of RAM. Anyway, I could test all of this, but
the testing takes so long (I'm running 30 minutes per test in order to
get any kind of consistency of results) that it is likely faster to
just ask the experts.
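
For reference, the barrier overhead quoted above works out as follows. This is only the arithmetic on the two approximate rates reported in this message (about 1200 with barriers off, about 1050 with barriers on); the function names are hypothetical.

```python
def relative_drop(baseline, measured):
    """Fraction of throughput lost going from `baseline` to `measured`."""
    return (baseline - measured) / baseline

def relative_gain(baseline, measured):
    """Fraction of throughput gained going from `baseline` to `measured`."""
    return (measured - baseline) / baseline

barriers_off = 1200  # approximate rate with write barriers disabled
barriers_on = 1050   # approximate rate with write barriers enabled

# Cost of turning barriers on, relative to the barriers-off rate.
print(f"drop with barriers on:  {relative_drop(barriers_off, barriers_on):.1%}")
# Same gap expressed the other way: gain from turning barriers off.
print(f"gain with barriers off: {relative_gain(barriers_on, barriers_off):.1%}")
```

The drop is 12.5% of the barriers-off rate, which is the figure rounded to 13% above; measured from the barriers-on side, the same gap is a gain of about 14.3%.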

I'd also welcome any other tuning suggestions.

Thanks

--sam
