Re: Proposal for background vacuum full/cluster

From: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
To: Paul Tillotson <pntil(at)shentel(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proposal for background vacuum full/cluster
Date: 2005-04-21 04:49:47
Message-ID: 20050421044947.GD58835@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 20, 2005 at 08:10:23PM -0400, Paul Tillotson wrote:
> Jim C. Nasby wrote:
>
> >I talked to a few people on IRC about this and they didn't think I was
> >nuts, so maybe this is something practical...
> >
> >In a nutshell, my idea is to use the normal transactional/XID code to
> >relocate tuples in the heap. Think of doing an UPDATE field=field if you
> >could tell update what page to put the new tuple on.
> >
> Be careful not to fire UPDATE triggers on the tuple while doing so.

To clarify, I'm not suggesting this actually be coded as a bunch of
updates. I used that as an example only.

> Keep in mind that the transaction that does the update can't also vacuum
> it's own tuples. You'd have to end one transaction, then wait until
> every transaction running while the updater ran finishes, then start the
> transaction that vacuums. Obviously your command would need to be able
> to start and end transactions. (Meaning that it can't be a user defined
> function, and it probably can't be a normal self-contained command in
> postgres.)

Yes. Note how I called it a background vacuum full/cluster.

> Are you thinking of coding this, or just suggesting it for others? I
> was thinking of coding something like this but found that I didn't
> understand enough of the internals of how the vacuum command actually
> works to be able to write this. I'd be willing to devote perhaps a few
> hours a week to it if you want to help me.

I certainly don't have enough knowledge right now to code this, but I'd
be willing to help any way I can.

> P. S.
>
> The last time I thought about it, I decided that the best solution is
> probably one that works just like vacuum full except that it scans the
> table in reverse order. It would do something like this:
>
> - Wait for exclusive lock.
That's exactly what I want to avoid. The reality of cluster and vacuum
full is that many (if not most) installs can't use them because of how
they disrupt the system. I'd like a version that doesn't do that.

> - Start at the end of the table -- call this page I.
> - If page I is completely empty, shrink the heap and go to step 1 again.
> (Page I is not empty now.)
> - *Scan forward in the table until you find a page that is empty. (Call
> it J)
> If no such page is found, there is no more free space in the
> table. Exit.
> - Move the tuples from page I to page J.
> - Drop the exclusive lock. and go to step one.

Same basic idea. I haven't gone into specific details because I want to
see how feasable it is. And since I can't code it myself the best I can
hope for is a TODO; and IMO I shouldn't try and tell whoever takes that
TODO how exactly to make this work.
--
Jim C. Nasby, Database Consultant decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2005-04-21 04:54:38 Re: Postgres: pg_hba.conf, md5, pg_shadow, encrypted passwords
Previous Message Jim C. Nasby 2005-04-21 04:43:53 Re: Proposal for background vacuum full/cluster