Re: autovacuum next steps, take 2

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, Hackers <pgsql-hackers(at)postgresql(dot)org>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: autovacuum next steps, take 2
Date: 2007-02-23 17:21:21
Message-ID: 20070223172121.GW19527@nasby.net
Lists: pgsql-hackers

On Fri, Feb 23, 2007 at 01:22:17PM -0300, Alvaro Herrera wrote:
> Jim C. Nasby wrote:
> > On Thu, Feb 22, 2007 at 10:32:44PM -0500, Matthew T. O'Connor wrote:
>
> > > I'm not sure this is a great idea, but I don't see how this would result
> > > in large numbers of workers working in one database. If workers work
> > > on tables in size order, and exit as soon as they catch up to an older
> > > worker, I don't see the problem. Newer workers are going to catch up to
> > > older workers pretty quickly since small tables will vacuum fairly quickly.
> >
> > The reason that won't necessarily happen is because you can get large
> > tables popping up as needing vacuuming at any time.
>
> Right.
>
> We know that a table that needs frequent vacuum necessarily has to be
> small -- so maybe have the second worker exit when it catches up with
> the first, or when the next table is above 1 GB, whichever happens
> first. That way, only the first worker can be processing the huge
> tables. The problem with this is that if one of your hot tables grows
> a bit larger than 1 GB, you suddenly have a change in autovacuuming
> behavior, for no really good reason.
>
> And while your second worker is processing the tables in the hundreds-MB
> range, your high-update 2 MB tables are neglected :-(

That's why I'm thinking it would be best to keep a small cap on the size
of the tables the second worker will take on. It probably also makes
sense to tie that cap to time rather than size, since the key factor is
that you want the second worker to hit the high-update tables every X
number of seconds.
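To make that concrete, here's a rough sketch; the time budget and the
assumed vacuum throughput are invented numbers for illustration, not real
autovacuum GUCs:

/*
 * Sketch of the "second worker only takes quick tables" idea.
 * second_worker_budget_sec and assumed_vacuum_mb_per_sec are
 * hypothetical knobs, not actual settings.
 */
#include <stdbool.h>
#include <stdio.h>

typedef struct
{
    const char *relname;
    double      size_mb;        /* current heap size */
} TableInfo;

static const double second_worker_budget_sec = 120.0;  /* hypothetical */
static const double assumed_vacuum_mb_per_sec = 5.0;   /* with cost delay */

/* The second worker only takes a table it can expect to finish quickly,
 * so it keeps coming back to the hot, small tables. */
static bool
second_worker_may_take(const TableInfo *tbl)
{
    double est_seconds = tbl->size_mb / assumed_vacuum_mb_per_sec;

    return est_seconds <= second_worker_budget_sec;
}

int
main(void)
{
    TableInfo tables[] = {
        {"hot_queue", 2.0},
        {"orders", 400.0},
        {"big_audit", 9000.0},
    };

    for (int i = 0; i < 3; i++)
        printf("%s -> %s\n", tables[i].relname,
               second_worker_may_take(&tables[i]) ? "second worker" : "first worker only");
    return 0;
}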

If we wanted to get fancy, we could factor in how far over the vacuum
threshold a table is, so that even a table on the larger side would
still get hit by the second vacuum if it's way over the threshold.
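Something along these lines; the threshold calculation is the documented
autovacuum formula, but the size cap and the overage cutoff are made-up
knobs just to show the shape of it:

/*
 * Sketch: let the second worker take a bigger table when it is far
 * enough past its vacuum threshold.  size_cap_mb and overage_override
 * are hypothetical knobs for illustration.
 */
#include <stdbool.h>
#include <stdio.h>

typedef struct
{
    const char *relname;
    double      reltuples;      /* estimated live tuples (pg_class) */
    double      n_dead_tuples;  /* from the stats collector */
    double      size_mb;
} TableStats;

static const double vac_base_threshold = 50;   /* autovacuum_vacuum_threshold */
static const double vac_scale_factor = 0.2;    /* autovacuum_vacuum_scale_factor */
static const double size_cap_mb = 100;         /* hypothetical */
static const double overage_override = 5.0;    /* hypothetical */

static bool
second_worker_should_vacuum(const TableStats *t)
{
    double threshold = vac_base_threshold + vac_scale_factor * t->reltuples;
    double overage = t->n_dead_tuples / threshold;

    if (overage < 1.0)
        return false;                       /* not due for vacuum at all */
    if (t->size_mb <= size_cap_mb)
        return true;                        /* small table: always fair game */
    return overage >= overage_override;     /* big, but way over threshold */
}

int
main(void)
{
    TableStats t = {"orders", 1000000, 1500000, 400.0};

    printf("%s -> %s\n", t.relname,
           second_worker_should_vacuum(&t) ? "vacuum now" : "skip");
    return 0;
}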

You know, maybe the best way to handle this is to force both vacuums to
exit after a certain amount of time, probably with a longer time limit
for the first vacuum in a database. After processing a large table for
10 minutes, the first vacuum would exit and re-evaluate what work needs
to be done, so medium-sized tables wouldn't get completely starved.
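Roughly this shape, in other words; the names are invented, and the
sketch only checks the budget between tables, so a big table can overrun
it and the worker then exits instead of taking more work:

/*
 * Sketch of the time-budget idea: a worker walks its to-do list and,
 * once it has used up its wall-clock budget, exits so the next pass can
 * re-evaluate what still needs vacuuming.  Elapsed time is simulated
 * rather than measured.
 */
#include <stdio.h>

typedef struct
{
    const char *relname;
    double      est_seconds;    /* rough cost of vacuuming this table */
} WorkItem;

static void
run_worker(const WorkItem *items, int nitems, double budget_seconds)
{
    double elapsed = 0.0;       /* simulated wall-clock time */

    for (int i = 0; i < nitems; i++)
    {
        /* Budget is only checked between tables, so a big table can
         * overrun it; the worker then exits rather than starting more. */
        if (elapsed >= budget_seconds)
        {
            printf("budget used up after %.0fs: exit and re-evaluate\n", elapsed);
            return;
        }
        printf("vacuuming %s (~%.0fs)\n", items[i].relname, items[i].est_seconds);
        elapsed += items[i].est_seconds;
    }
}

int
main(void)
{
    /* First worker: 10 minute budget, big table first. */
    WorkItem todo[] = {
        {"big_audit", 3000},
        {"orders", 90},
        {"hot_queue", 1},
    };

    run_worker(todo, 3, 600);
    return 0;
}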
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
