Skip site navigation (1) Skip section navigation (2)

Re: [PERFORM] Database size growing over time and leads to performance impact

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: gnanam(at)zoniac(dot)com
Cc: pgsql-admin(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] Database size growing over time and leads to performance impact
Date: 2010-03-29 04:12:53
Message-ID: 4BB028C5.3000507@2ndquadrant.com (view raw or flat)
Thread:
Lists: pgsql-adminpgsql-performance
Please don't cc two of the lists here.  It makes things difficult for 
users who only subscribe to one list or the other who reply--their post 
to the other list will be held for moderation.  And that's a pain for 
the moderators too.  In this case, either the pgsql-admin or 
pgsql-performance list would have been appropriate for this question, 
but not both at the same time.  The suggested approach when unsure is to 
try the most obvious list, and if you don't get a response after a day 
or two then try a second one.

Gnanakumar wrote:
>
> We're using PostgreSQL 8.2.  Recently, in our production database, 
> there was a severe performance impact..  Even though, we're regularly 
> doing both:
>
> 1.     VACUUM FULL ANALYZE once in a week during low-usage time and
>
> 2.     ANALYZE everyday at low-usage time
>  
>
> Also, we noticed that the physical database size has grown upto 30 
> GB.  But, if I dump the database in the form of SQL and import it 
> locally in my machine, it was only 3.2 GB.
>

Most VACUUM problems are caused by not running VACUUM often enough.  A 
weekly VACUUM is really infrequent.  And it's rarely ever a good idea to 
run VACUUM FULL.

You should switch over to running a regular VACUUM, not a full one, on 
something closer to a daily or more frequent basis instead.

> Then while searching in Google to optimize database size, I found the 
> following useful link: 
>
> http://www.linuxinsight.com/optimize_postgresql_database_size.html
>
> It says that even vacuumdb or reindexdb doesn't really compact 
> database size, only dump/restore does because of MVCC architecture 
> feature in PostgreSQL and this has been proven here.
>

That article covers PostgreSQL as of V7.4, and much of it is outdated 
information that doesn't apply to the 8.2 you're running.  It's a pretty 
bad description even of that version.  You should try to forget 
everything you read there and instead look at 
http://www.postgresql.org/docs/8.2/interactive/maintenance.html for an 
accurate introduction to this topic.  I'm sorry you've been misled by it.

> Physical database size was found using the following command:
>
> du -sh /usr/local/pgsql/data/base/<database-oid> 
>
> I also cross-checked this size using 
> "pg_size_pretty(pg_database_size(datname))".
>

You should use the queries shown at 
http://wiki.postgresql.org/wiki/Disk_Usage instead of this, which will 
break down where the disk space is going by table and index.  You will 
discover one of two things:

1) As the database grows, most of the disk space is being taken up by 
the tables themselves.  In this case, a more frequent VACUUM is likely 
to make that go away.  You might also need to bump up one of the 
parameters in the postgresql.conf file, max_fsm_pages

2) Lots of disk space is being taken up by indexes on the tables.  If 
this is the case, the fact that you're running VACUUM FULL all the time 
is the likely cause of your problem.


> Questions
>
> 1.     Is there any version/update of PostgreSQL addressing this issue?
>
> 2.     How in real time, this issues are handled by other PostgreSQL 
> users without taking to downtime?
>
> 3.     Any ideas or links whether this is addressed in upcoming 
> PostgreSQL version 9.0 release?
>
>  
>

PostgreSQL 8.3 turns on a better tuned autovacuum by default so that 
it's more likely VACUUM will run often enough to keep the problem you're 
having from happening.  8.4 removes an additional source of problems 
that can cause VACUUM to stop working.  As of 8.4, most of the problems 
in this area are gone in the default configuration.  Just looking at 
newer versions of the associated documentation will give you an idea 
what's changed; 
http://www.postgresql.org/docs/current/interactive/maintenance.html is 
the 8.4 version.  The problems with VACUUM FULL are so bad that as of 
9.0, the old implementation of that (the one you're probably getting bad 
behavior from) has been replaced by a more efficient one.

The main situation newer PostgreSQL versions can still run into a 
problem where the indexes get large if you're deleting records in some 
ways; http://www.postgresql.org/docs/8.2/static/routine-reindex.html 
describes that issue, and that bit of documentation and the underlying 
behavior is unchanged in later releases.  It's much more likely that 
you're running into the very common situation instead where you're 
running VACUUM FULL infrequently, where you should be running regular 
VACUUM frequently instead.

-- 
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com   www.2ndQuadrant.us


In response to

pgsql-performance by date

Next:From: Greg SmithDate: 2010-03-29 04:21:34
Subject: Re: Database size growing over time and leads to performance impact
Previous:From: C├ędric VillemainDate: 2010-03-28 21:04:24
Subject: Re: Forcing index scan on query produces 16x faster

pgsql-admin by date

Next:From: Greg SmithDate: 2010-03-29 04:21:34
Subject: Re: Database size growing over time and leads to performance impact
Previous:From: Nilesh GovindarajanDate: 2010-03-29 02:04:12
Subject: Re: Socket & TCP connections

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group