Re: vacuuming very large table problem

From: if <zeylienospam(at)gmail(dot)com>
To: Decibel! <decibel(at)decibel(dot)org>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: vacuuming very large table problem
Date: 2008-02-22 07:57:41
Message-ID: f37dba8b0802212357r8c311eby1aa780d42eb2ff1a@mail.gmail.com
Lists: pgsql-admin

On Fri, Feb 22, 2008 at 2:30 AM, Decibel! <decibel(at)decibel(dot)org> wrote:
>
> On Feb 15, 2008, at 4:56 AM, if wrote:
> > We use PostgreSQL as a backend to our email gateway, and keep all
> > emails in the database. We're using Postgres version 7.4.8 (yes, I
> > know it's old) and a rather specific table schema (the application
> > was designed that way) -- all emails are split into 2kB parts and
> > fed into pg_largeobject. So, long story short, I now have a catch-22
> > situation -- the database is using about 0.7TB and we are running
> > out of space ;-)
> > I can delete some old stuff, but I cannot run a full vacuum to
> > reclaim disk space (it takes way more than a full weekend), and I
> > also cannot dump/restore as there's no free space (that needs 2x the
> > database size).
> >
> > So, with these restrictions applied, I figured that I could somehow
> > zero out all the old entries in pg_largeobject, or even physically
> > delete those files, and rebuild all necessary indexes.
> >
> > What is the best way to do this?
> > IMO, dd'ing /dev/zero over these files will cause Postgres to
> > reinitialize the empty blocks, and after that I will still need to
> > vacuum full over 0.7TB, am I right?
> > And if I delete them and then start the postmaster, there'll be lots
> > of complaining, but will the latest data be saved?
> >
> > How can I delete, for instance, the first 70% of the data reasonably
> > fast?
>
> You're still inserting new email, right? If so, why are you worried
> about reclaiming space? Just delete some stuff, let vacuum clean it
> up, and make sure that your FSM is big enough (easiest way to do that
> is to run vacuumdb -av).
>
> You'll also want to periodically reindex, especially in 7.4.
>
> And yes, upgrade. At a minimum you need to get to the latest 7.4,
> which doesn't require anything special.
> --
> Decibel!, aka Jim C. Nasby, Database Architect decibel(at)decibel(dot)org
> Give your computer some brain candy! www.distributed.net Team #1828
>
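For reference, the FSM check suggested above looks roughly like this on
7.4 (a sketch only; "maildb" is a placeholder database name and the
exact INFO wording is from memory):

    # database-wide verbose vacuum; 7.4 prints an FSM summary at the end
    vacuumdb -av 2>&1 | tail -n 20

    # look for lines like
    #   INFO:  free space map: ... pages stored; ... total pages needed
    # if "pages needed" exceeds max_fsm_pages, raise it in
    # postgresql.conf (changing it needs a postmaster restart on 7.4):
    #   max_fsm_pages = 1000000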

Thank you for answering!

The catch-22 was that we only had about 20G of free space left (in
time terms, 4-5 days until "X" hour), and vacuuming the whole
pg_largeobject would take more than 5 days even under no other load;
we have plenty of load during work hours, so we couldn't do that anyway.
Deleting stuff was inefficient too; it's just _too_ slow. I still do
not fully understand the schema, but there are lots of constraints,
and a simple "delete message" action was in fact rescanning the whole
database (see the sketch below). The index on pg_largeobject alone was
using more than 8 gigs of space, so it was one big, unbelievable mess ;-)
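In case anyone hits the same thing: deletes this slow under lots of
constraints usually mean foreign keys whose referencing columns have no
index, so every deleted row triggers a sequential scan of the
referencing table. A sketch of how to check (the table and column names
here are hypothetical):

    -- the FK trigger effectively performs a lookup like this for each
    -- deleted row; if the plan is a Seq Scan, deletes will crawl
    EXPLAIN SELECT 1 FROM message_parts WHERE message_id = 12345;

    -- the fix is an index on the referencing column
    CREATE INDEX message_parts_message_id_idx
        ON message_parts (message_id);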

Also, this is a vendor-patched Postgres instance, and improving
anything without access to the source is way beyond my skills.

Here's how I solved this, in case anyone is interested:
oid2name helped me locate the files on disk; I simply took the binary
from the Debian postgres-7.4 package.
I did a quick select and found which files I could simply remove (i.e.
the relation pg_largeobject resides in 1G segment files
$pgbase/base/16404.x, where x is between 1 and 570), so I just deleted
the first 400 or so (after checking that they contained only old data),
renamed the ".x" suffixes of the rest accordingly, and started the
postmaster. Then some reindexing/vacuuming, and voila ;-) Roughly, it
went like the sketch below.
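To spell that out (a sketch only -- this is unsupported, dangerous
surgery on a stopped cluster; "maildb", the filenode 16404, and the
segment counts are from my case and will differ elsewhere):

    # find the on-disk filenode for pg_largeobject
    # (flag syntax per the 7.4 contrib oid2name; check its --help)
    oid2name -d maildb | grep pg_largeobject

    # stop the postmaster before touching any files!
    pg_ctl -D $PGDATA stop

    cd $pgbase/base    # directory holding 16404, 16404.1, ... 16404.570

    # remove the oldest 1G segments (verified to hold only old data)
    for i in $(seq 1 400); do rm 16404.$i; done

    # renumber the surviving segments so they are contiguous again
    j=1
    for i in $(seq 401 570); do mv 16404.$i 16404.$j; j=$((j+1)); done

    pg_ctl -D $PGDATA start

    # the indexes now point at blocks that no longer exist; rebuild
    # them, then vacuum to refresh statistics
    psql maildb -c 'REINDEX TABLE pg_largeobject'
    psql maildb -c 'VACUUM ANALYZE pg_largeobject'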

And yes, upgrading is a class "B" priority now. Comparing 8.3 to 7.4.8
is like comparing Postgres to MySQL, imo ;-)

regards,
if
