Re: Autovacuum in the backend

From: Alvaro Herrera <alvherre(at)surnet(dot)cl>
To: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Qingqing Zhou <zhouqq(at)cs(dot)toronto(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Autovacuum in the backend
Date: 2005-06-16 03:55:47
Message-ID: 20050616035547.GA14519@surnet.cl
Lists: pgsql-general pgsql-hackers

On Wed, Jun 15, 2005 at 11:42:17PM -0400, Matthew T. O'Connor wrote:
> Alvaro Herrera wrote:
>
> >A question for interested parties. I'm thinking of handling the
> >user/password issue by reading the flat files (the copies of pg_shadow,
> >pg_database, etc).
> >
> >The only thing that I'd need to modify is to add the datdba field to
> >pg_database, so we can figure out an appropriate user for vacuuming each
> >database.
>
> I probably don't understand all the issues involved here but reading
> pg_shadow by hand seems problematic. Do you constantly re-read it?
> What happens when a new user is added etc....

You don't read the pg_shadow table. Rather, you read the pg_user file,
which is a plain-text file representing the information in pg_shadow.
It's kept up to date by backends that modify user information. Likewise
for pg_database and pg_group.
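
A minimal sketch of what reading such a flat file could look like. The
line layout used here (quoted database name, OID, then the proposed
datdba field) is purely an assumption for illustration and not the
actual on-disk format; DbEntry and parse_flat_database_file are
hypothetical names.

```python
import shlex
from typing import List, NamedTuple


class DbEntry(NamedTuple):
    name: str
    oid: int
    datdba: int  # hypothetical: the owner field proposed in this message


def parse_flat_database_file(text: str) -> List[DbEntry]:
    """Parse an assumed one-line-per-database, whitespace-separated layout."""
    entries = []
    for line in text.splitlines():
        # shlex.split keeps a double-quoted name containing spaces as one field
        fields = shlex.split(line)
        if len(fields) < 3:
            continue  # skip blank or short lines
        entries.append(DbEntry(fields[0], int(fields[1]), int(fields[2])))
    return entries


# Made-up sample content, only to exercise the parser
sample = '"template1" 1 10\n"my db" 16384 100\n'
entries = parse_flat_database_file(sample)
```

Because the file is rewritten by the backends that change the catalog, a
reader like this would re-parse it whenever it needs a fresh view rather
than cache the result indefinitely.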

> Can't autovacuum run as a super-user that can vacuum anything?

That'd be another way to do it, maybe simpler.

Currently I'm working on separating this into two parts though, one being
a shlib and the other the standard postmaster-launched backend process. So
I don't have to address this issue right now. It just bothered me to
need a separate file with username and password, and the corresponding
code to read it.

One issue I do have to deal with right now is how many autovacuum
processes we want to have running. The current approach is to have one
autovacuum process. Two other options would be to have one per
database, or one per tablespace. What do people think?

I'm leaning toward the simpler option myself but I'd like to hear more
opinions. Particularly since one-per-database makes the code a lot
simpler as far as I can see, because the shlib only needs to worry about
issuing VACUUM commands; with the other approaches, the shlib has to
manage everything (keeping the pg_autovacuum table up to date, figuring
out when vacuums are needed, etc.)

The main problem with the one-per-database approach is that we wouldn't
have a (simple) way of coordinating vacuums so that they don't compete for I/O.
That's why I thought of the one-per-tablespace approach, though that one
is the most complex of all.
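
To illustrate the coordination idea, here is a rough sketch of grouping
pending work by tablespace, so that each tablespace has a single queue
that one process drains serially. The input tuples and the function name
plan_by_tablespace are hypothetical, and nothing here reflects existing
autovacuum code.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def plan_by_tablespace(
    pending: List[Tuple[str, str, str]]
) -> Dict[str, List[str]]:
    """Group (database, table, tablespace) tuples into one queue per tablespace."""
    queues: Dict[str, List[str]] = defaultdict(list)
    for database, table, tablespace in pending:
        # Vacuums in the same queue would run one at a time, so they
        # never compete with each other for that tablespace's I/O.
        queues[tablespace].append(f"{database}.{table}")
    return dict(queues)


# Made-up workload, only for illustration
pending = [
    ("sales", "orders", "fast_disk"),
    ("sales", "audit_log", "slow_disk"),
    ("hr", "employees", "fast_disk"),
]
plan = plan_by_tablespace(pending)
```

Note that the fast_disk queue above mixes tables from two databases. A
backend is connected to a single database, so a per-tablespace process
would have to hop between databases, which is a large part of why this
approach is the most complex.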

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Un poeta es un mundo encerrado en un hombre" (Victor Hugo)
