Re: Autovacuum and Autoanalyze

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, David Fetter <david(at)fetter(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Autovacuum and Autoanalyze
Date: 2008-09-17 17:44:54
Message-ID: 1221673494.3913.2148.camel@ebony.2ndQuadrant
Lists: pgsql-hackers

On Wed, 2008-09-17 at 10:52 -0400, Tom Lane wrote:
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> > Alvaro Herrera wrote:
> >> Why doesn't this new request conflict with that one?
>
> > The problem back then was that a CREATE INDEX was waiting on the
> > autoanalyze to finish, and the autoanalyze took a long time to finish
> > because of vacuum_cost_delay. Now that we have the auto-cancel
> > mechanism, that's not a problem.
>
> Define "not a problem". With auto-cancel, what will happen is that
> whatever work the autoanalyze does will be wasted. It seems to me
> that the current complaint is about background autovacuum/autoanalyze
> wasting cycles during a bulk load, and there's certainly no purer waste
> than an analyze cycle that gets aborted.

OK, but that's an argument against auto-anything, not just against
splitting out autoanalyze and autovacuum.

> I tend to agree with Alvaro that there's not very much of a use case for
> an analyze-only autovacuum mode.

Did he say that? I thought he said "we could do that". What did that mean, Alvaro?

I have a customer saying this would be a good thing, and I agree. The
roles of autovacuum and autoanalyze are not exactly matched, so why do
we force them to be run together or not at all? Why not allow
the user to specify whether they want both or not? It's an option; we're
not forcing anyone to do it that way if they don't want to.

> Assuming that we get to the point of
> having a parallelizing pg_restore, it would be interesting to give it an
> option to include ANALYZE for each table it's loaded among the tasks
> that it schedules. (I'm visualizing these commands as being made up by
> pg_restore itself, *not* added to the pg_dump output.) Then you could
> have a reasonably optimal total workflow, whereas allowing autovacuum
> to try to schedule the ANALYZEs can't be.

That doesn't solve all problems, just ones with pg_restore. That's nice
and I won't turn it away, but what will we do about plain pg_dump and
about other table creations and loads?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
