Re: COUNT(*) and index-only scans

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: "Bruce Momjian" <bruce(at)momjian(dot)us>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COUNT(*) and index-only scans
Date: 2011-10-10 19:15:07
Message-ID: 4E92FDEB0200002500041CF2@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Right now, our costing model for index-only scans is pretty dumb.
> It assumes that using an index-only scan will avoid 10% of the
> heap fetches. That could easily be low, and on an insert-only
> table or one where only the recently-updated rows are routinely
> accessed, it could also be high.

As a reality check, I just ran this query on a table in a statewide
copy of our data:

select count(*),
sum(case when xmin = '2'::xid then 0 else 1 end) as read_heap
from "CaseHist";

and got:

count | read_heap
-----------+-----------
205765311 | 3934924

So on our real-world database, it would skip something on the order
of 98% of the heap reads, right?

> This isn't just an exercise in costing, though: right now, we
> don't even generate a plan to use an index for a full-table scan,
> because we assume that it can never be cheaper. This is actually
> not quite true even in previous releases (suppose the table is
> severely bloated but the index is not) and it's going to be less
> true now that we have index-only scans. So that's going to need
> some adjustment, too.

OK. Thanks for clarifying.

-Kevin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Cédric Villemain 2011-10-10 19:16:10 table/index options | was: COUNT(*) and index-only scans
Previous Message Robert Haas 2011-10-10 19:02:36 Re: ALTER EXTENSION .. ADD/DROP weirdness