AFAICS, the statistics information "the number of dead tuples"
(n_dead_tuples) has an estimation error.
VACUUM sends a message to stats collector process when it has swept a table.
The stats collector receives the message and sets n_dead_tuples of the table
to zero. However, we can update or delete tuples even if a concurrent vacuum
is running through the table. There might be some dead tuples that were
created after start of the vacuum. Therefore, it's not always correct to set
n_dead_tuples to zero at the end of vacuum. Especially, the error will be
worse when a vacuum takes long time.
The autovacuum sees the stats information to decide when to vacuum.
The error in n_dead_tuples misleads it and vacuum starvation might occur
because we always underestimate the number of dead tuples.
I'm thinking to add "the number of vacuumed tuples" to the message from
vacuum. The stats collector will subtract the value from n_dead_tuples
instead of setting it to zero. This is also needed if we want to make
some kinds of "partial" vacuum methods.
Thoughts? Is this worth doing, I'd like to implement it.
NTT Open Source Software Center
pgsql-hackers by date
|Next:||From: Pavel Stehule||Date: 2007-02-01 11:08:07|
|Subject: Re: PL/pgSQL RENAME functionality in TODOs|
|Previous:||From: Jignesh K. Shah||Date: 2007-02-01 09:54:24|
|Subject: Re: fixing Makefile.shlib for solaris/gcc with -m64 flag|