Re: How to best use 32 15k.7 300GB drives?

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Robert Schnabel <schnabelr(at)missouri(dot)edu>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, "david(at)lang(dot)hm" <david(at)lang(dot)hm>, pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: How to best use 32 15k.7 300GB drives?
Date: 2011-01-28 17:00:18
Message-ID: AANLkTi=3j3PW=Lkrwu2XiWdS4jCy0TcLLO4L2h3=9U0G@mail.gmail.com
Lists: pgsql-performance

On Fri, Jan 28, 2011 at 9:39 AM, Robert Schnabel <schnabelr(at)missouri(dot)edu> wrote:
> I can't do outside the database.  So yes, once the upload is done I run
> queries that update every row for certain columns, not every column.  After
> I'm done with a table I run a VACUUM ANALYZE.  I'm really not worried about
> what my table looks like on disk.  I actually take other steps also to avoid
> what you're talking about.

It will still get bloated. If you update one column in one row in pg,
you now have two copies of that row in the database. If you update one
column in 1M rows, you now have 2M row versions in the table (1M "dead"
rows, 1M "live" rows). VACUUM ANALYZE will not get rid of them, but it
will mark the dead space free for reuse by future updates / inserts.
VACUUM FULL or CLUSTER will actually free up the space, but they lock
the table while doing so.
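To make that concrete, here's a rough sketch (table and column names
are illustrative, not from the thread) of a whole-column update and how
the resulting dead tuples show up in the stats views:

```sql
-- Illustrative: a 1M-row table, then an update touching every row.
CREATE TABLE demo (id int PRIMARY KEY, val int);
INSERT INTO demo SELECT g, 0 FROM generate_series(1, 1000000) g;

UPDATE demo SET val = 1;   -- writes 1M new row versions; the old ones are now dead

-- The dead row versions are visible via the statistics collector:
SELECT n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'demo';

VACUUM ANALYZE demo;       -- marks dead space reusable, but does not shrink the file
```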

There's nothing wrong with whole table updates as part of an import
process, you just have to know to "clean up" after you're done, and
regular vacuum can't fix this issue -- only vacuum full, cluster, or
reindex can.
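So the cleanup step after such an import might look something like the
following (names are illustrative; note that VACUUM FULL and CLUSTER
both take an exclusive lock on the table while they rewrite it):

```sql
-- Option 1: rewrite the table compactly, returning space to the OS
VACUUM FULL demo;

-- Option 2: rewrite the table physically ordered by an index,
-- which also reclaims the dead space
CLUSTER demo USING demo_pkey;

-- Rebuild bloated indexes if needed
REINDEX TABLE demo;

ANALYZE demo;  -- refresh planner statistics after the rewrite
```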

