Re: autovacuum next steps, take 2

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
Cc: Hackers <pgsql-hackers(at)postgresql(dot)org>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: autovacuum next steps, take 2
Date: 2007-02-27 16:25:10
Message-ID: 20070227162509.GD29041@nasby.net
Lists: pgsql-hackers

On Tue, Feb 27, 2007 at 12:54:28AM -0500, Matthew T. O'Connor wrote:
> Jim C. Nasby wrote:
> >On Mon, Feb 26, 2007 at 10:18:36PM -0500, Matthew T. O'Connor wrote:
> >>Jim C. Nasby wrote:
> >>Here is a worst case example: A DB with 6 tables all of which are highly
> >>active and will need to be vacuumed constantly. While this is totally
> >>hypothetical, it is how I envision things working (without the threshold).
> >
> >I fail to see how a simple 6 table case is 'worst case'. It's common to
> >see hundreds of tables, and I've run across more than one database with
> >thousands of tables (think partitioning). In cases like those it's
> >certainly possible, perhaps even likely that you would get many daemons
> >running in the database at one time just from different tables suddenly
> >needing vacuuming and appearing at a higher point in the list than other
> >tables. With 100 ~1G tables getting updates it certainly wouldn't be
> >hard to end up with 10 of those being vacuumed all at the same time.
>
> Yes, 6 tables is small; the worst-case part of the example was that all
> the tables would need to be vacuumed constantly. Most databases only
> have a few hot tables. Most tables only need to be vacuumed every once in
> a while.

It's not the hot tables that are the issue; it's how many large tables
(hot or not) can come up for vacuuming in sequence. For example, if
A-Z are all large tables (i.e., a few GB each), with A being the largest
and Z the smallest, think about what happens here:

Round 1: A needs vacuuming. Daemon gets to it and starts working.
Round 2: B now needs vacuuming. It's slightly smaller than A, so daemon
2 gets to it.
Round 3: C now needs vacuuming. Daemon 3.
...
Round 26: Z now needs vacuuming. Daemon 26 picks it up.

You now have 26 daemons running in the database.

Now, we can argue about how likely that scenario is, but I don't think
it's relevant. What matters is that it *is* possible, and as long as
that's the case you'd have to have some kind of limit. (While this
simple 26-table example is definitely worst-case, if you've got hundreds
of tables that are all multiple GB in size, I think it wouldn't be hard
at all to end up with a dozen or more daemons all hammering away.)
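
To make the pile-up concrete, here's a toy simulation of that
largest-first scenario. It has nothing to do with the actual autovacuum
code, and the table sizes and per-round throughput are numbers I made up
purely for illustration:

    # Toy sketch of the 26-table pile-up above -- not PostgreSQL code.
    # Table sizes (GB) and per-round throughput are invented numbers.

    # Tables A..Z, all "a few GB", with A the largest.
    tables = {chr(ord('A') + i): 6.0 - i * 0.15 for i in range(26)}
    GB_PER_ROUND = 0.05   # assumed throughput of one daemon per round

    active = {}           # table -> GB its daemon still has to vacuum

    # Each round one more table comes due, largest first, and a new
    # daemon is launched for it because nothing caps the per-DB count.
    for rnd, name in enumerate(sorted(tables, key=tables.get, reverse=True), 1):
        active[name] = tables[name]
        for t in list(active):            # every daemon makes progress
            active[t] -= GB_PER_ROUND
            if active[t] <= 0:
                del active[t]
        print(f"Round {rnd:2d}: {len(active):2d} daemons running")

Run it and the daemon count climbs by one every round, ending at 26,
because each big table takes far longer to finish than the interval at
which new ones come due.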

> >I do like the idea since it should be easier to tune, but I think we
> >still need some limit on it. Perhaps as a first-pass we could just have
> >a hard limit and log a message and/or set a flag any time we hit it.
> >That would hopefully allow us to get information about how big a problem
> >it really is. We could go one step further and say that the last daemon
> >that can start in a database will only vacuum tables that can be done
> >quickly; that's essentially what we've been talking about, except the
> >limit we've been discussing would be hard-coded at 2.
>
> I'm confused, what limit would be set at 2? The number of concurrent
> workers? I've never said that.

The point I was making is that the proposal to limit the 2nd daemon to
processing only tables it can do in a short period of time is akin to
setting a limit of only 2 daemons per database at a time.
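
To spell out what I mean, here's a rough sketch of that policy (the names
and the "quick" cutoff below are mine, invented just to show the shape
of it, not anything proposed verbatim):

    # Sketch of the "last daemon only takes quick tables" idea as I read
    # it; MAX_WORKERS and QUICK_TABLE_GB are invented placeholders.
    MAX_WORKERS = 2            # what the proposal effectively amounts to
    QUICK_TABLE_GB = 0.1       # cutoff for what the last slot may touch

    def may_vacuum(worker_slot: int, table_gb: float) -> bool:
        """Slot 1 is the long-running daemon; the last slot allowed to
        start only picks up tables it can finish quickly."""
        if worker_slot > MAX_WORKERS:
            return False                       # no third daemon at all
        if worker_slot == MAX_WORKERS:
            return table_gb <= QUICK_TABLE_GB  # last slot: quick tables only
        return True                            # earlier slots: any table

    # A multi-GB table can only ever land on slot 1, so at most one long
    # vacuum runs in the database while slot 2 skims the small hot tables.
    assert may_vacuum(1, 4.0) and not may_vacuum(2, 4.0)
    assert may_vacuum(2, 0.05) and not may_vacuum(3, 0.05)

Which is why I say it's effectively a hard-coded cap of 2 per database.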
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
