Skip site navigation (1) Skip section navigation (2)

Estimation error in n_dead_tuples

From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Estimation error in n_dead_tuples
Date: 2007-02-01 10:43:45
Message-ID: 20070201184722.5AD4.ITAGAKI.TAKAHIRO@oss.ntt.co.jp (view raw or flat)
Thread:
Lists: pgsql-hackers
AFAICS, the statistics information "the number of dead tuples"
(n_dead_tuples) has an estimation error.

VACUUM sends a message to stats collector process when it has swept a table.
The stats collector receives the message and sets n_dead_tuples of the table
to zero. However, we can update or delete tuples even if a concurrent vacuum
is running through the table. There might be some dead tuples that were
created after start of the vacuum. Therefore, it's not always correct to set
n_dead_tuples to zero at the end of vacuum. Especially, the error will be
worse when a vacuum takes long time.

The autovacuum sees the stats information to decide when to vacuum.
The error in n_dead_tuples misleads it and vacuum starvation might occur
because we always underestimate the number of dead tuples.


I'm thinking to add "the number of vacuumed tuples" to the message from
vacuum. The stats collector will subtract the value from n_dead_tuples
instead of setting it to zero. This is also needed if we want to make
some kinds of "partial" vacuum methods.

Thoughts?  Is this worth doing, I'd like to implement it.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Responses

pgsql-hackers by date

Next:From: Pavel StehuleDate: 2007-02-01 11:08:07
Subject: Re: PL/pgSQL RENAME functionality in TODOs
Previous:From: Jignesh K. ShahDate: 2007-02-01 09:54:24
Subject: Re: fixing Makefile.shlib for solaris/gcc with -m64 flag

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group