Re: Postgres not using indexes

From: Lawrence Cohan <LCohan(at)web(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Postgres not using indexes
Date: 2011-04-21 19:09:01
Message-ID: 965AA5440EAC094E9F722519E285ACEDB368961C2F@WWCEXCHANGE.web.web.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hello Kevin,

We managed to put together a new test server running PG 9.0.2 on 2socketsx6cores = 12CPU with 64 GB RAM against a 3PAR 10TB SAN. We kept the settings I submitted already (and enclosed below) and after 12 hours of pounding the box with PGBENCH running 8 scripts to perform all of INSERT/UPDATE/DELETE/SELECT statements we wanted we got a pretty good picture of what can do with those settings. We got a load average of 60 with CPU up and around that 60% mark, pushing through about 1400 transactions per second for 12 hours. We made the changes as suggested and listed below but the throughput dropped from 1400 t/s to 400 t/s and I suspect are the "stalled" transactions you mentioned about.

Here's what we changed:

Current Settings Test Settings
================================================================
shared_buffers = 500MB shared_buffers = 8GB
effective_cache_size = 2GB effective_cache_size = 32GB

Just to be 100% accurate we ask you what do you mean by:

1) "Make the background writer more aggressive and/or back off on
shared_memory, so that there isn't such a glut of dirty pages to
Write during a checkpoint."

By aggressive does he mean changing any of the following?
# - Background Writer -

#bgwriter_delay = 200ms # 10-10000ms between rounds
#bgwriter_lru_maxpages = 100 # 0-1000 max buffers written/round
#bgwriter_lru_multiplier = 2.0 # 0-10.0 multiplier on buffers scanned/round

Or we should be better of by the checkpoint segment handling - any of the below:
# - Checkpoints -

checkpoint_segments = 64 # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 5min # range 30s-1h
#checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0
checkpoint_warning = 60s # 0 disables

Best regards,
Lawrence Cohan.

-----Original Message-----
From: pgsql-bugs-owner(at)postgresql(dot)org [mailto:pgsql-bugs-owner(at)postgresql(dot)org] On Behalf Of Kevin Grittner
Sent: March-30-11 4:12 PM
To: pgsql-bugs(at)postgresql(dot)org; Lawrence Cohan
Subject: Re: [BUGS] Postgres not using indexes

Lawrence Cohan <LCohan(at)web(dot)com> wrote:

> looks like we will need to change at least the two values below
> and maybe play with work_mem to see if it solves our issues.

You will probably get better throughput by bumping up
shared_buffers to the recommended setting, but beware of "stalls" in
query processing at checkpoint time. If that happens you want to
make the background writer more aggressive and/or back off on
shared_memory, so that there isn't such a glut of dirty pages to
write during a checkpoint.

I think even the recommended setting for effective_cache size is on
the low side. This one affects how expensive the optimizer thinks
index usage will be, so given your current problem this is probably
important to raise. I add up shared_buffers and what free tells me
is cached space is after PostgreSQL has been running a while. That
usually winds up being 1GB to 2GB less than total memory on our
machines, so actually, I usually just start there.

We usually need to reduce random_page_cost to get good plans. For a
fully-cached database you may want to reduce both seq_page_cost and
random_page_cost to equal numbers around 0.05. With partial
caching, we often leave seq_page_cost alone and reduce
random_page_cost to 2. YMMV.

The setting for work_mem can be tricky, especially with 1200
connections configured. Each connection may be using one or more
allocations of work_mem at the same time. Which leads to the
question of why you have 1200 connections configured. You are
almost always better off using a connection pooler to limit this to
something on the order of twice your CPU cores plus your effective
spindle count. Tomcat has a very good connection pooler built in,
as do many other products. There are also good external poolers,
like pgpool and pgbouncer.

With a reasonable amount of RAM you're almost always better off
bumping wal_buffers to 32MB.

> The only issue is that we are running a 24/7 web site against the
> db and if we need to restart PG for the changes to take place we
> will need to wait for a downtime before any changes can be made.

Some of these can be set per user with ALTER ROLE. New connections
would then start using the new settings with no down time.

-Kevin

--
Sent via pgsql-bugs mailing list (pgsql-bugs(at)postgresql(dot)org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Attention:
The information contained in this message and or attachments is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from any system and destroy any copies.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Kevin Grittner 2011-04-21 19:37:42 Re: Postgres not using indexes
Previous Message Heikki Linnakangas 2011-04-21 18:53:47 Re: BUG #5297: Add XATMI C API