Re: Why does my DB size differ between Production and DR? (Postgres 8.4)

From: Bill Moran <wmoran(at)potentialtech(dot)com>
To: Aleksey Tsalolikhin <atsaloli(dot)tech(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Why does my DB size differ between Production and DR? (Postgres 8.4)
Date: 2011-02-02 13:46:36
Message-ID: 20110202084636.97bc0ab5.wmoran@potentialtech.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

In response to Aleksey Tsalolikhin <atsaloli(dot)tech(at)gmail(dot)com>:

> On Mon, Jan 31, 2011 at 7:52 PM, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> wrote:
> > On Mon, Jan 31, 2011 at 5:54 PM, Aleksey Tsalolikhin
> > <atsaloli(dot)tech(at)gmail(dot)com> wrote:
> >> Situation:  Disk usage on production server root filesystem is at 68%
> >> utilization (80 GB used), on DR is at 51% (56 GB used).   We use
> >> SlonyII-1.2.x to keep the DR up to date.  I would like to account for
> >> the 24 GB difference.
> >
> > This is likely free space in your database.  Some of it is completely
> > normal and actually improves performance.  Too much and your db is
> > bloated and things starting taking too long.
>
> Thanks, Scott!
>
> Bucardo's "check_postgres.pl --action bloat" complains about one table,
> 1 GB wasted. So the other tables must be OK.
>
> So what about my DR, which doesn't have this same 20+ GB of "free space".
> Will it acquire it once it goes into production? Will performance be impacted
> as it acquires the free space? Should I even be concerned about the
> difference in disk usage or is it normal and expected?

Difference in free space from master to slaves is typical. Transactions
run on the slaves differently than on the master. For example, if you
rollback transactions on the master, that can bloat tables, but those
activities are never communicated to the slaves because the rollback
doesn't alter any data.

It's also possible that you have different autovacuum configs on the two
different machines (have you checked) or that the hardware isn't the
same, thus one is able to vacuum more successfully than the other,
or that simply the fates have caused vacuum to start at times that it
gets more done on one server than the other.

Do not be afraid of vacuum full. It's not that it's an evil command or
should never be used, etc. It's just something that has consequences
that you need to be aware of, such as:
*) It can take a long time
*) It locks tables while it works on them, thus it blocks other processes
from accessing those tables
*) It can cause index bloat

However, there are mitigating factors:
*) You can tell it which tables to vacuum, thus you can vacuum full one
table at a time to recduce the overall impact
*) It can be interrupted, so if it's taking longer than you're able to
wait, you can cancel it.
*) You can use the REINDEX command to clean up index bloat.

Based on personal experience, and the fact that you have a slony slave to
work with, I recommend the following:

1) On the Slony slave, do the following, timing each step so you have an
estimate of how long they will take on the master
1a) VACUUM the table. This is non-locking and will do some preliminary
work so that VACUUM FULL takes less time.
1b) VACUUM FULL just that table. Slony will be unable to replicate to
the table while the FULL is running, but that's OK, it will catch
up after it's done and the master won't be interrupted.
1c) REINDEX just that table. This will have no effect on the master.
2) Now that you have time estimates for all those steps, add the times
for 1b and 1c together. This is an estimate of how long the master
database will be interrupted while you do maintenance (step 1a does
not interrupt other work going on). Schedule downtime for about 2x
that time, just in case things run a little longer.
3) Run steps 1a - 1c on the master. Start 1a before your maintenance
window starts, with enough time that it should be finished before
your maintenance window.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Rich Shepard 2011-02-02 14:26:10 Re: Importing/Appending to Existing Table
Previous Message gvim 2011-02-02 13:44:32 Privileges for read-only tables with sequence and foreign keys