Skip site navigation (1) Skip section navigation (2)

Re: Auto Vacuum

From: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Russell Smith <mr-russ(at)pws(dot)com(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Auto Vacuum
Date: 2004-11-29 02:45:20
Message-ID: (view raw or whole thread)
Lists: pgsql-hackers
Matthew T. O'Connor looked at this fairly closely leading up to 8.0
feature freeze. There was a long discussion earlier this year with respect
to libpq vs. using backend functions directly to vacuum multiple

This should be mandatory reading before proceeding. Also, you want to talk
to Matthew and review his patch (submitted early July, I think).



On Sun, 28 Nov 2004, Bruce Momjian wrote:

> I have added an auto-vacuum TODO item:
> * Auto-vacuum
>         o Move into the backend code
>         o Scan the buffer cache to find free space or use background writer
>         o Use free-space map information to guide refilling
> ---------------------------------------------------------------------------
> Russell Smith wrote:
> > Hi All,
> >
> > I am doing serious thinking about the implementation of Auto Vacuum as part of the backend, Not using libpq, but classing internal functions directly.
> > It appears to me that calling internal functions directly is a better implementation than using the external library to do the job.
> >
> > I know I might be stepping on Matthew's toes, but I don't really want to.  I am a complete newbie to the postgresql code, however I am trying.
> > Vacuum appears to be one of the bigger saw points with administrator having to configure it via scheduled tasks.
> >
> > I have outlined things I have thought about below.  I've surely missed a lot, and am open to feedback.  Others may like the current tuning used
> > by the external autovacuum, however to have stats that are collected at vacuum, and used for later vacuum would mean we don't need the
> > stats collector running.
> >
> > The major autovacuum issues
> >
> > 1. Transaction Wraparound
> > 2. Vacuum of relations
> > 3. Tracking of when to do vacuums
> > 4. Where to store information needed by auto vacuum
> >
> > I would suggest the first step is to replicate the wraparound and relation vacuuming from the current code first. But I could be wrong.
> > Currently there are a lot of tuning options in the external autovacuum, and the best way to run vacuum internally will need to be thought out.
> >
> > 1. Transaction Wraparound
> >
> > It appears from the code, that the best time to do a transaction wraparound db wide vacuum is when the
> > frozenXID has wrapped, TransactionIdPrecedes(myXID, dbform->datfrozenxid).  And is probably the most
> > simple vacuum to implement.
> >
> >
> > 2. Vacuuming of relations
> >
> > Currently, the entire heap must be vacuumed at one time.  I would possible be desireable to have only part of the relation vacuumed at
> > a time.  If you can find out which parts of the relation have the most slack space.  There is a todo item regarding tracking recent deletions
> > so they can be resused.  Some form of this would be helpful to work out what to vacuum.  Performance issues for this type of activity
> > may be a concern.  But I have no experience to be able to make comment on them.  So I welcome yours.
> >
> > 3. Tracking of when to vacuum
> >
> > Current autovacuum relies the stats collector to be running.  I would like to only use the stats if they are available,
> > and have an option to be able to vacuum accurately without having to have stats running.
> >
> > By adding certain histograms, on tuples, filesize and slack space we can guage the time between vacuums
> >
> > number of tuples will show the inserts as opposed to updates.
> > file size will show that the file is growing and by how much between vacuums.
> > slack space will show the delete/updated records.
> >
> > A new guc and relation option would need to be implemented to give a target slack space in a file.
> > this will help to reduce growth in relations if vacuum happens to not run frequently enough.  This information
> > can also inform autovacuum that it should be vacuuming more frequently.  The number would be a percentage,
> > eg 10% of the total file size is allowed to be unused.  Also alter table would allow users to set levels of slackness
> > for each relation.  If the table gets too much more than the target slack space, a "partial" vacuum full could be run
> > to reduce the size of the table by moving tuples at the end of the table to slack space near the beginning and
> > shortening the table length.  It would require a full table lock, but you may be able to space it out, to only do a page
> > at a time.
> >
> > /* target percentage of slack space */
> > vacuum_default_target_slack_space = 10
> >
> >
> > 4. Where to store information required by auto vacuum.
> >
> > Auto vacuum needs somewhere to stop the information it needs about current and future vacuuming.  I am unsure of where
> > to put this.  It appears as if it will have a number of fields.  I feel like pg_class is the best place to put the information, but
> > again I have no idea.
> >
> > That's the best I can do for now.  I can clarify things further if required.
> >
> > Regards
> >
> > Russell.
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
> >
> --
>   Bruce Momjian                        |
>   pgman(at)candle(dot)pha(dot)pa(dot)us               |  (610) 359-1001
>   +  If your life is a hard drive,     |  13 Roberts Road
>   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org

In response to

pgsql-hackers by date

Next:From: Jim SeymourDate: 2004-11-29 03:13:06
Subject: Re: [HACKERS] Adding Reply-To: <listname> to Lists configuration ...
Previous:From: Bruce MomjianDate: 2004-11-29 02:29:45
Subject: Re: Auto Vacuum

Privacy Policy | About PostgreSQL
Copyright © 1996-2015 The PostgreSQL Global Development Group