Re: Cleaning up unreferenced table files

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Cleaning up unreferenced table files
Date: 2005-04-26 15:40:36
Message-ID: Pine.OSF.4.61.0504261820040.260117@kosh.hut.fi
Lists: pgsql-patches

On Mon, 25 Apr 2005, Bruce Momjian wrote:

> Tom Lane wrote:
...
>> I think though that we ought to first consider the question of whether
>> we *want* this functionality. On reflection I'm fairly nervous about
>> the idea of actually deleting anything during startup --- seems like a
>> good recipe for turning small failures into large failures. But if
>> we're not going to delete anything then it's questionable whether we
>> need to code it like this at all. It'd certainly be easier and safer to
>> examine these tables after the system is up and running normally.
>
> Let's discuss this. The patch as submitted checks for unreferenced
> files on bootup and reports them in the log file, but does not delete
> them. That seems like the proper behavior. I think we delete from
> pgsql_tmp on bootup, but we _know_ those aren't referenced.
>
> What other user interface would trigger this if we did it after startup?
> Wouldn't we have to lock pg_class against VACUUM while we scan the file
> system, and are we sure we do things in pg_class or the file system
> first consistently? It seems much more prone to error doing it while
> the system is running.

I agree.

Also, stale files can only appear after a backend crash, since they are
normally cleaned up at the end of the transaction. If the check were a
separate program or command, the administrator would already have to be
aware of the issue. Otherwise, he wouldn't know he needs to run it after
a crash.

I feel that crashes that leave behind stale files are rare. You
would need an application that creates/drops tables as part of normal
operation. Some kind of a large batch load might do that: BEGIN, CREATE
TABLE foo, COPY 1 GB of data, COMMIT.

The nasty thing right now is, you might end up with 1 GB of wasted disk
space, and never even know it.
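
For illustration only, the report-only check discussed here can be sketched
roughly as below. This is not the patch (which is C code in the backend); it
is a minimal Python sketch under stated assumptions: relation files in a
database directory are named "<relfilenode>" or "<relfilenode>.<segment>",
and the set of valid relfilenodes has already been collected from pg_class.
The function name and parameters are hypothetical.

```python
import os
import re

# Assumed naming scheme: "<relfilenode>" or "<relfilenode>.<segment>".
# Anything else (PG_VERSION, subdirectories such as pgsql_tmp) is ignored.
_RELFILE_RE = re.compile(r"^(\d+)(?:\.\d+)?$")


def find_unreferenced_files(db_dir, known_relfilenodes):
    """Return the sorted names of files in db_dir that look like relation
    files but whose relfilenode is not in known_relfilenodes.

    Report only: nothing is deleted.
    """
    stale = []
    for name in os.listdir(db_dir):
        # Skip subdirectories and other non-regular entries.
        if not os.path.isfile(os.path.join(db_dir, name)):
            continue
        match = _RELFILE_RE.match(name)
        if match is None:
            continue  # not a relation file under the assumed naming scheme
        if int(match.group(1)) not in known_relfilenodes:
            stale.append(name)
    return sorted(stale)
```

The caller would log whatever comes back and leave the files alone. Running
something like this while the system is up would race with concurrent
CREATE/DROP, which is the concern Bruce raises above.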

> I guess I am happy with just reporting during startup like the patch
> does now.

Ok. I'll fix the design issues Tom raised earlier, add documentation,
and resubmit.

We can come back to this after a release or two, when we have more
confidence in the feature. Maybe we'll also get some feedback on how often
those stale files occur in practice.

- Heikki
