Re: Performance degradation of inserts when database size grows

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance degradation of inserts when database size grows
Date: 2011-05-24 21:46:06
Message-ID: 4DDC271E.2020101@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 05/17/2011 08:45 AM, Andrey Vorobiev wrote:
> 1. How does database size affect insert performance?

As indexes grow, it becomes slower to insert into them. It has to
navigate all of the indexes on the table to figure out where to add the
new row into there, and that navigation time goes up when tables are
larger. Try using the queries at
http://wiki.postgresql.org/wiki/Disk_Usage to quantify how big your
indexes are. Many people are absolutely shocked to see how large they
become. And some database designers throw indexes onto every possible
column combination as if they were free.

> 2. Why does number of written buffers increase when database size grows?
>

As indexes grow, the changes needed to insert more rows get spread over
more blocks too.

You can install pg_buffercache and analyze what's actually getting dirty
in the buffer cache to directly measure what's changing here. If you
look at http://projects.2ndquadrant.com/talks and download the "Inside
the PostgreSQL Buffer Cache" talk and its "Sample Queries" set, those
will give you some examples of how to summarize everything.

> 3. How can I further analyze this problem?

This may not actually be a problem in that it's something you can
resolve. If you assume that you can insert into a giant table at the
same speed you can insert into a trivial one, you'll have to adjust your
thinking because that's never going to be true. Removing some indexes
may help; reducing the columns in the index is also good; and some
people end up partitioning their data specifically to help with this
situation. It's also possible to regain some of the earlier performance
using things like REINDEX and CLUSTER.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Stefan Keller 2011-05-24 23:45:47 Re: FW: KVP table vs. hstore - hstore performance (Was: Postgres NoSQL emulation)
Previous Message Tom Lane 2011-05-24 20:38:33 Re: Hash Anti Join performance degradation