Skip site navigation (1) Skip section navigation (2)

Re: [PERFORM] encouraging index-only scans

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>,PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PERFORM] encouraging index-only scans
Date: 2013-09-06 00:14:37
Message-ID: 20130906001437.GA29264@momjian.us (view raw or flat)
Thread:
Lists: pgsql-hackerspgsql-performance
On Wed, Sep  4, 2013 at 04:56:55PM -0400, Bruce Momjian wrote:
> > "Add a column pg_class.relallvisible to remember the number of pages
> > that were all-visible according to the visibility map as of the last
> > VACUUM
> > (or ANALYZE, or some other operations that update pg_class.relpages).
> > Use relallvisible/relpages, instead of an arbitrary constant, to
> > estimate how many heap page fetches can be avoided during an
> > index-only scan."
> > 
> > Have I missed some nuance?
> 
> I am looking back at this issue now and I think you are correct.  The
> commit you mention (Oct 7 2011) says ANALYZE updates the visibility map,
> and the code matches that:
> 
> 	    if (!inh)
> 	        vac_update_relstats(onerel,
> 	                            RelationGetNumberOfBlocks(onerel),
> 	                            totalrows,
> -->	                            visibilitymap_count(onerel),
> 	                            hasindex,
> 	                            InvalidTransactionId);
> 
> so if an index scan was not being used after an ANALYZE, it isn't a bad
> allvisibile estimate but something else.  This code was in PG 9.2.

Actually, I now realize it is more complex than that, and worse.  There
are several questions to study to understand when pg_class.relallvisible
is updated (which is used to determine if index-only scans are a good
optimization choice), and when VM all-visible bits are set so heap pages
can be skipped during index-only scans:

	1)  When are VM bits set:
		vacuum (non-full)
		analyze (only some random pages)
	2)  When are massive rows added but VM bits not set:
		copy
	3) When are VM bits cleared:
		insert/update/delete
		vacuum (non-full)
	4)  When are VM map files cleared:
		vacuum full
		cluster
	5) When is pg_class.relallvisible updated via a VM map file scan:
		vacuum (non-full)
		analyze

Vacuums run by autovacuum are driven by n_dead_tuples, which is only
update and delete.  Therefore, any scenario where vacuum (non-full) is
never run will not have significant VM bits set.  The only bits that
will be set will be by pages visited randomly by analyze.

The following table activities will not set proper VM bits:

                vacuum full
                cluster
                copy
		insert-only

If updates and deletes happen, there will eventually be sufficient
reason for autovacuum to vacuum the table and set proper VM bits, and
pg_class.relallvisible.

The calculus we should use to determine when we need to run vacuum has
changed with index-only scans, and I am not sure we ever fully addressed
this.

-- 
  Bruce Momjian  <bruce(at)momjian(dot)us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +


In response to

Responses

pgsql-performance by date

Next:From: Robert HaasDate: 2013-09-06 01:10:06
Subject: Re: [PERFORM] encouraging index-only scans
Previous:From: Tom LaneDate: 2013-09-05 23:50:18
Subject: Re: View with and without ::text casting performs differently.

pgsql-hackers by date

Next:From: Robert HaasDate: 2013-09-06 01:10:06
Subject: Re: [PERFORM] encouraging index-only scans
Previous:From: Bruce MomjianDate: 2013-09-05 22:36:08
Subject: Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group