Re: autovacuum

From: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>
To: Christopher Browne <cbbrowne(at)acm(dot)org>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: autovacuum
Date: 2006-02-06 18:37:53
Message-ID: 1139251073.22740.78.camel@state.g2switchworks.com
Lists: pgsql-admin pgsql-hackers

On Wed, 2006-02-01 at 20:32, Christopher Browne wrote:
> > This seems maybe a bit overkill to me. I think what would be more useful
> > is if autovacuum could execute more than one vacuum at a time, and you
> > could specify tables that are high priority (or possibly just say that
> > all tables with less than X live tuples in them are high priority). That
> > way a longer-running vacuum on a large table wouldn't prevent more
> > vacuum-sensitive tables (such as queues) from being vacuumed frequently
> > enough.
>
> Actually, I can think of a case for much the opposite, namely to want
> to concurrently vacuum some LARGE tables...
>
> Suppose you have 2 rather big tables that get updates on similar
> schedules such that both will have a lot of dead tuples at similar
> times.
>
> And suppose both of these tables are Way Large, so that they take
> six hours to vacuum.
>
> I could argue for kicking off vacuums on both, at the same moment;
> they'll both be occupying transactions for 1/4 of a day, and, with
> possibly related patterns of updates, doing them one after the other
> *wouldn't* forcibly get you more tuples cleaned than doing them
> concurrently.
>
> I'm not sure that's a case to push for, either, as something
> pg_autovacuum is smart enough to handle; I'm just putting out some
> ideas that got enough internal discussion to suggest they were
> interesting enough to let others consider...

This could be a big win on databases where those two tables are on
different tablespaces, since the two vacuums would no longer be fighting
over the same thin I/O stream.

The autovacuum daemon could schedule vacuums so that each tablespace has
its own list of vacuums to run, and then run those lists in parallel
(i.e. tablespace1 has a single vacuum running through its list while
tablespace2 has its own single vacuum running through another).

Maybe there could even be a setting that tells it the maximum number to
run in parallel for each tablespace. After all, a tablespace running on
30 hard drives in a RAID-10 could handle several concurrent vacuums,
while another tablespace running on a single drive would be best limited
to one vacuum at a time.
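
To make the idea concrete, here is a rough sketch of per-tablespace
scheduling with a parallelism cap. It is not how pg_autovacuum works
today, and every name in it (run_vacuums, max_parallel, the vacuum
callback, the table and tablespace names) is made up for illustration.
The vacuum callback is a stand-in; a real daemon would issue VACUUM over
a database connection instead.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_tablespace(tables):
    """Build one vacuum list per tablespace.

    tables is an iterable of (table_name, tablespace_name) pairs.
    """
    queues = defaultdict(list)
    for table, tablespace in tables:
        queues[tablespace].append(table)
    return dict(queues)


def run_vacuums(tables, max_parallel, vacuum, default_limit=1):
    """Run vacuum(table) for every table, tablespace by tablespace.

    Each tablespace gets its own worker pool, so tablespaces proceed in
    parallel with each other. max_parallel maps a tablespace name to the
    most vacuums allowed at once on it (hypothetical setting); any
    tablespace not listed falls back to default_limit.
    """
    queues = group_by_tablespace(tables)
    pools = {
        ts: ThreadPoolExecutor(max_workers=max_parallel.get(ts, default_limit))
        for ts in queues
    }
    futures = []
    for ts, names in queues.items():
        for name in names:
            # A pool never runs more tasks at once than its max_workers,
            # which is what enforces the per-tablespace cap.
            futures.append(pools[ts].submit(vacuum, name))
    for pool in pools.values():
        pool.shutdown(wait=True)
    # result() re-raises any exception a vacuum hit.
    return [f.result() for f in futures]


if __name__ == "__main__":
    def pretend_vacuum(table):
        # Stand-in only: no database is touched here.
        print("vacuuming", table)
        return table

    tables = [
        ("orders", "raid10_space"),
        ("order_lines", "raid10_space"),
        ("queue", "single_disk_space"),
        ("sessions", "single_disk_space"),
    ]
    run_vacuums(tables, {"raid10_space": 3}, pretend_vacuum)
```

With those settings the RAID-10 tablespace may run up to three vacuums
at once, while the single-drive tablespace works through its list one
table at a time.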
