Re: concurrent IO in postgres?

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: mladen(dot)gogala(at)vmsinfo(dot)com, "wozniak(at)lanl(dot)gov" <wozniak(at)lanl(dot)gov>, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: concurrent IO in postgres?
Date: 2011-01-04 13:53:01
Message-ID: 4D23263D.1090608@2ndquadrant.com
Lists: pgsql-performance

Jeff Janes wrote:
> There are parameters governing how likely it is that bgwriter falls
> behind in the first place, though.
>
> http://www.postgresql.org/docs/9.0/static/runtime-config-resource.html
>
> In particular bgwriter_lru_maxpages could be made bigger and/or
> bgwriter_delay smaller.
>

Also, one of the structures used for caching the list of fsync requests
the background writer is handling, the thing that results in backend
writes when it can't keep up, is sized in proportion to shared_buffers
on the server. Setting shared_buffers to a reasonable size and lowering
bgwriter_delay are the two changes that do the most to let the
background writer keep up with the overall load, rather than having
backends write their own buffers. And given the way checkpoints work in
PostgreSQL, more backend writes is generally not a performance
improvement, even though it does get more processes writing at once.
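
To make that concrete, here is a sketch of the sort of postgresql.conf
changes involved. The values are illustrative starting points only, not
recommendations for any particular system:

    shared_buffers = 2GB           # needs a server restart; also sizes the
                                   # fsync request queue mentioned above
    bgwriter_delay = 100ms         # default 200ms; wake the bgwriter more often
    bgwriter_lru_maxpages = 500    # default 100; allow more writes per round

The two bgwriter settings take effect on a reload; shared_buffers
requires a restart.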

The post that opened this thread didn't say whether any PostgreSQL
server tuning or OS tuning was done to try to optimize performance.
The usual checklist at
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server normally
helps.

At the kernel level, the #1 thing I find necessary for decent bulk
performance in a lot of situations is proper read-ahead. On Linux, for
example, you must get the OS doing readahead to compensate for the fact
that PostgreSQL issues its requests as a serial sequence: it asks for
block #1, then block #2, then block #3, and so on. If the OS doesn't
pick up on that pattern and start reading blocks 4, 5, 6, etc. before
the server asks for them, keeping the disk fully occupied and returning
database data quickly from the kernel buffers, you'll never reach the
full potential of even a regular hard drive. And the default readahead
on Linux is far too low for modern hardware.
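
As an example of checking and raising that on Linux (the device name
and the value here are placeholders; benchmark before settling on a
number):

    # Show the current readahead, in 512-byte sectors
    blockdev --getra /dev/sda

    # Raise it to 4096 sectors (2MB); the usual default of 256
    # sectors (128kB) is what I'm calling far too low here
    blockdev --setra 4096 /dev/sda

Note that blockdev settings don't survive a reboot, so you'll need to
reapply them at startup from something like rc.local.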

> But bulk copy binary might use a nondefault allocation strategy, and I
> don't know enough about that part of the code to assess the
> interaction of that with bgwriter.
>

It's documented pretty well in src/backend/storage/buffer/README ,
specifically the "Buffer Ring Replacement Strategy" section. Sequential
scan reads, VACUUM, COPY IN, and CREATE TABLE AS SELECT are the
operations that get one of the more specialized buffer replacement
strategies. These all use the same basic approach, which is to re-use a
ring of buffers rather than running rampant over the whole buffer
cache. The main difference between them is the size of the ring. Inside
freelist.c, the GetAccessStrategy code shows the ring size you get in
each of these modes.
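
From memory of what the 9.0 code does (this is paraphrased, not a
quote, so check your own source tree), the heart of GetAccessStrategy
looks roughly like this:

    switch (btype)
    {
        case BAS_NORMAL:    /* no ring: use the whole buffer cache */
            return NULL;
        case BAS_BULKREAD:  /* large sequential scans */
            ring_size = 256 * 1024 / BLCKSZ;        /* 256kB */
            break;
        case BAS_VACUUM:
            ring_size = 256 * 1024 / BLCKSZ;        /* 256kB */
            break;
        case BAS_BULKWRITE: /* COPY IN, CREATE TABLE AS SELECT */
            ring_size = 16 * 1024 * 1024 / BLCKSZ;  /* 16MB */
            break;
    }

With the default 8kB BLCKSZ, that works out to a ring of 32 buffers for
the read and vacuum cases and 2048 buffers for bulk writes.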

Since PostgreSQL reads and writes through the OS buffer cache in
addition to its own shared_buffers pool, this whole ring buffer thing
doesn't protect the OS cache from being trashed by a big bulk
operation. Your only real defense there is to make shared_buffers large
enough that it retains a decent chunk of data even in the wake of that.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
