Skip site navigation (1) Skip section navigation (2)

Re: lazy_update_relstats considered harmful (was Re: [PERFORM] pg_autovacuum not having enough suction ?)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>,pgsql-hackers(at)postgreSQL(dot)org,Otto Blomqvist <o(dot)blomqvist(at)secomintl(dot)com>,pgsql-performance(at)postgreSQL(dot)org
Subject: Re: lazy_update_relstats considered harmful (was Re: [PERFORM] pg_autovacuum not having enough suction ?)
Date: 2005-03-25 22:35:58
Message-ID: 14073.1111790158@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-hackerspgsql-performance
Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Fri, 2005-03-25 at 15:22 -0500, Tom Lane wrote:
>> 2. Dead tuples don't have that much influence on scan costs either, at
>> least not once they are marked as known-dead.  Certainly they shouldn't
>> be charged at full freight.

> Yes, minor additional CPU time, but the main issue is when the dead
> tuples force additional I/O.

I/O costs are mostly estimated off relpages, though, not reltuples.
The only time you really pay through the nose for a dead tuple is when
an indexscan visits it, but with the known-dead marking we now do in
btree indexes, I'm pretty sure that path is seldom taken.

>> It's possible that there'd be some value in adding a column to pg_class
>> to record dead tuple count, but given what we have now, the calculation
>> in lazy_update_relstats is totally wrong.

> Yes, thats the way. We can record the (averaged?) dead tuple count, but
> also record the actual row count in reltuples.

What I'd be inclined to record is the actual number of dead rows removed
by the most recent VACUUM.  Any math on that is best done in the
planner, since we can change the logic more easily than the database
contents.  It'd probably be reasonable to take half of that number as
the estimate of the average number of dead tuples.

But in any case, that's for the future; we can't have it in 8.0.*, and
right at the moment I'm focusing on what to push out for 8.0.2.

> We definitely need to record the physical and logical tuple counts,
> since each of them have different contributions to run-times.

There isn't any difference, if you are talking about fully dead tuples.
It would be possible for VACUUM to also count the number of
not-committed-but-not-removable tuples (ie, new from still-open
transactions, plus dead-but-still-visible-to-somebody), but I'm not sure
that it would be useful to do so, because that sort of count is hugely
transient.  The stat would be irrelevant moments after it was taken.

			regards, tom lane

In response to

pgsql-performance by date

Next:From: Tom LaneDate: 2005-03-25 23:06:02
Subject: Re: pg_autovacuum not having enough suction ?
Previous:From: Simon RiggsDate: 2005-03-25 22:20:23
Subject: Re: lazy_update_relstats considered harmful (was Re: [PERFORM]

pgsql-hackers by date

Next:From: Alvaro HerreraDate: 2005-03-25 22:54:35
Subject: HeapTupleSatisfiesUpdate missing a bet?
Previous:From: Simon RiggsDate: 2005-03-25 22:20:23
Subject: Re: lazy_update_relstats considered harmful (was Re: [PERFORM]

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group