Re: [GENERAL] ANALYZE getting dead tuple count hopelessly wrong

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [GENERAL] ANALYZE getting dead tuple count hopelessly wrong
Date: 2008-04-02 20:29:31
Message-ID: 13905.1207168171@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> Please see the attached patch. One change I made is to hold the SHARE lock
> on the page while ANALYZE is reading tuples from it. I thought it would
> be a right thing to do instead of repeatedly acquiring/releasing the lock.

Bruce pointed out to me off-list that this patch is closely related to
the patch I proposed awhile back for ANALYZE *overcounting* dead tuples.
That one is on the current commit-fest list because we held it over
after this discussion:
http://archives.postgresql.org/pgsql-hackers/2007-11/msg00771.php

On reflection it seems to me that we allowed ourselves to get distracted
by schemes for reducing the error attributable to the uncertain state
of in-doubt tuples. That's still something interesting to think about,
but we forgot the fact that there's a serious problem in 8.3 and the
patches we have would clearly make it better. What I propose therefore
is combining this patch with my older one so that ANALYZE counts
according to the following rules:

REDIRECT line pointer: ignore
DEAD line pointer: count as dead
HEAPTUPLE_LIVE tuple: count as live, include in statistics pool
HEAPTUPLE_DEAD: count as dead
HEAPTUPLE_RECENTLY_DEAD: count as dead
HEAPTUPLE_INSERT_IN_PROGRESS: ignore
HEAPTUPLE_DELETE_IN_PROGRESS: count as live

We might want to adjust these rules later after more thought, but in any
case ANALYZE has to be fixed to be able to distinguish these cases in
the first place. This is better than what we have and is reasonable to
back-patch. Trying to reduce the race conditions for in-doubt tuples
should go on the TODO list.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Mark Mielke 2008-04-02 20:34:09 Re: [GENERAL] SHA1 on postgres 8.3
Previous Message Andrew Sullivan 2008-04-02 19:35:34 Re: Can Postgres 8.x start if some disks containing tablespaces are not mounted?

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Mielke 2008-04-02 20:34:09 Re: [GENERAL] SHA1 on postgres 8.3
Previous Message Andrew Dunstan 2008-04-02 19:43:48 Re: Patch queue -> wiki (was varadic patch)