Re: Bitmap scans vs. the statistics views

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Bitmap scans vs. the statistics views
Date: 2005-10-06 02:50:18
Message-ID: 11852.1128567018@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Quite some time ago I complained about the fact that bitmap index scans
weren't being counted sanely by the statistics mechanism:
http://archives.postgresql.org/pgsql-hackers/2005-04/msg00675.php
That discussion trailed off without deciding how to fix it, but we
really can't let this go without fixing it in 8.1.

I studied the code some more and realized that we had been operating
under some fundamental misconceptions. The distinction made in the
existing stats code between "tuples fetched" and "tuples returned" has
nothing whatever to do with live vs. dead tuples --- all these counts
are made only after determining that a tuple is visible. The way it
really works in 8.0 is:

table "tuples_returned": tuples returned by heap_getnext, ie,
live tuples found by seqscans
table "tuples_fetched": tuples returned by heap_fetch under
conditions other than being invoked by an indexscan
(this covers various random cases like ANALYZE and
TID scans)
index "tuples_fetched": tuples returned by heap_fetch when
invoked by an indexscan on this index
index "tuples_returned": actually, exactly the same as
tuples_fetched.

This possibly explains why the original design of the pg_stat_all_tables
view exposed only two of the seemingly four interesting counts.

I have just committed changes that redefine the counts like this:

table "tuples_returned": same as before, ie,
live tuples found by seqscans
table "tuples_fetched": tuples returned by heap_fetch when
invoked by a bitmap scan (the random other cases
no longer get counted at all)
index "tuples_fetched": same as before, ie, live tuples
fetched by simple indexscans using this index
index "tuples_returned": number of index entries returned
from the index AM, counting both simple and bitmap
scans.

The pg_stat_all_tables view is modified to add the table's
tuples_fetched count to the sum of the per-index tuples_fetched counts,
so that idx_tup_fetch counts both simple and bitmap index scans.
It's possible to break these out by looking at the low-level statistics
functions, however.

With the new definitions you can get some weak information about the
numbers of dead tuples fetched by indexscans, which was not possible
at all before. (It's weak because it's not easy to distinguish
differences due to dead tuples from differences due to bitmap scanning.)
In the earlier discussion, Josh commented that getting stats about dead
tuples probably belongs somewhere else anyway, and I'm inclined to agree
with that; so I don't feel too bad about not having provided more
complete information.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Neil Conway 2005-10-06 04:41:26 fixing LISTEN/NOTIFY
Previous Message Junji TERAMOTO 2005-10-06 02:42:32 Re: prefix btree implementation