Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc: "Stuart Brooks" <stuartb(at)cat(dot)co(dot)za>, pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date: 2008-03-31 15:32:43
Message-ID: 28334.1206977563@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> Seems like the redirected-dead line pointers are playing spoil-sport here.
> In this particular example, the deleted tuples may get truncated to
> redirected-dead line pointers. Analyze would report them as empty
> slots and not as dead tuples. So in the worst case, if all the deleted
> tuples are already truncated to redirected-dead line pointers, analyze
> may report "zero" dead tuple count.

[ Please see if you can stop using the "redirected dead" terminology ]

Yeah, I think I agree. The page pruning code is set up so that changing
a line pointer to DEAD state doesn't change the count of dead tuples in
the table, so we are counting unreclaimed DEAD pointers as still being
dead tuples requiring VACUUM. ANALYZE should surely not affect that.

It looks like there's no trivial way to get ANALYZE to do things that
way, though. heap_release_fetch() doesn't distinguish a DEAD line
pointer from an unused or redirected one. But in the current
implementation of ANALYZE there's really no benefit to using
heap_release_fetch anyway --- it always examines all line pointers
on each selected page, so we might as well rewrite it to use a simple
loop more like vacuum uses.

I notice that this'd leave heap_release_fetch completely unused...
at least in HEAD I'd be tempted to get rid of it and restore heap_fetch
to its former simplicity.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Colin Wetherbee 2008-03-31 15:54:14 Re: [pgsql-general] Daily digest v1.8030 (22 messages)
Previous Message Pavel Stehule 2008-03-31 15:29:25 Re: Array operator "sum array values"

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2008-03-31 15:33:42 Fwd: WIP: CASE statement for PL/pgSQL
Previous Message Tom Lane 2008-03-31 14:58:46 Re: first time hacker ;) messing with prepared statements