Re: performance question on VACUUM FULL (Postgres 8.4.2)

From: PG User 2010 <pguser2010(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: performance question on VACUUM FULL (Postgres 8.4.2)
Date: 2010-01-21 22:43:32
Message-ID: 1e937d501001211443y2464e0v33faceaa90a01ee4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Jeff,

Are you running VACUUM (without FULL) regularly? And if so, is that
> insufficient?
>

Unfortunately, we have not run vacuumlo as often as we would like, and that
has caused a lot of garbage blobs to get generated by our application.

You can always expect some degree of bloat. Can you give an exact number
> before and after the VACUUM FULL? Or is this a one-shot attempt that
> never finished?
>
> If large objects are being added/removed regularly, it might be better
> just to wait (and do regular VACUUMs), and the table will naturally
> compact after the rows at the end are removed.
>

Our vacuum full is still running after several days, so I'm unsure when it
will finish (it would be nice to be able to get a rough idea of % complete
for vacuum full, but I don't know of any way to do that). I estimate that
there are probably several million dead blobs taking up ~ 80 gigabytes of
space.

I believe that once we are running vacuumlo regularly, then normal vacuums
will work fine and we won't have much of a wasted space issue. However,
right now we have LOTS of dead space and that is causing operational issues
(primarily slower + larger backups, maybe some other slight performance
issues).

So, here are my questions (maybe I should post these to the -general or
-admin mailing lists?):

1) is there any easy way to fiddle with the vacuum process so that it is not
CPU bound and doing very little I/O? Why would vacuum full be CPU bound
anyway???

2) is it possible to interrupt VACUUM FULL, then re-start it later on and
have it pick up where it was working before?

3) are there any alternatives, such as CLUSTER (which doesn't seem to be
allowed since pg_largeboject is a system table) that would work?

Thanks so much!

Sam

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message tmp 2010-01-21 22:51:00 TPC-C implementation for postgresql?
Previous Message Tory M Blue 2010-01-21 22:15:29 Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL