Tom Lane wrote:
>>One thing that is possibly relevant here is that in 8.0 a plain VACUUM
>>doesn't set reltuples to the exactly correct number, but to an
>>interpolated value that reflects our estimate of the "steady state"
>>average between vacuums. I wonder if that code is wrong, or if it's
>>operating as designed but is confusing autovac.
>Now that I think it over, I'm thinking that I must have been suffering
>severe brain fade the day I wrote lazy_update_relstats() (see
>vacuumlazy.c). The numbers that that routine is averaging are the pre-
>and post-vacuum physical tuple counts. But the difference between them
>consists of known-dead tuples, and we shouldn't be factoring dead tuples
>into reltuples. The planner has always considered reltuples to count
>only live tuples, and I think this is correct on two grounds:
>1. The numbers of tuples estimated to be returned by scans certainly
>shouldn't count dead ones.
>2. Dead tuples don't have that much influence on scan costs either, at
>least not once they are marked as known-dead. Certainly they shouldn't
>be charged at full freight.
>It's possible that there'd be some value in adding a column to pg_class
>to record dead tuple count, but given what we have now, the calculation
>in lazy_update_relstats is totally wrong.
>The idea I was trying to capture is that the tuple density is at a
>minimum right after VACUUM, and will increase as free space is filled
>in until the next VACUUM, so that recording the exact tuple count
>underestimates the number of tuples that will be seen on-the-average.
>But I'm not sure that idea really holds water. The only way that a
>table can be at "steady state" over a long period is if the number of
>live tuples remains roughly constant (ie, inserts balance deletes).
>What actually increases and decreases over a VACUUM cycle is the density
>of *dead* tuples ... but per the above arguments this isn't something
>we should adjust reltuples for.
>So I'm thinking lazy_update_relstats should be ripped out and we should
>go back to recording just the actual stats.
>Sound reasonable? Or was I right the first time and suffering brain
In response to
pgsql-performance by date
|Next:||From: Tom Lane||Date: 2005-03-25 21:25:09|
|Subject: Re: Delete query takes exorbitant amount of time |
|Previous:||From: Stephan Szabo||Date: 2005-03-25 21:10:51|
|Subject: Re: Delete query takes exorbitant amount of time|
pgsql-hackers by date
|Next:||From: Lyubomir Petrov||Date: 2005-03-25 21:22:36|
|Subject: Re: Bug 1500|
|Previous:||From: Palle Girgensohn||Date: 2005-03-25 21:06:52|
|Subject: Re: Patch for collation using ICU |