Re: [HACKERS] Slow count(*) again...

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Mladen Gogala <mladen(dot)gogala(at)vmsinfo(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "david(at)lang(dot)hm" <david(at)lang(dot)hm>, Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>, Vitalii Tymchyshyn <tivv00(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [HACKERS] Slow count(*) again...
Date: 2011-02-02 18:47:21
Message-ID: 4D49A6B9.4060608@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

Mladen Gogala wrote:
> People are complaining about the optimizer not using the indexes all
> over the place, there should be a way to
> make the optimizer explicitly prefer the indexes, like was the case
> with Oracle's venerable RBO (rules based
> optimizer). RBO didn't use statistics, it had a rank of access method
> and used the access method with the highest
> rank of all available access methods. In practice, it translated into:
> if an index exists - use it.

Given that even Oracle kicked out the RBO a long time ago, I'm not so
sure longing for those good old days will go very far. I regularly see
queries that were tweaked to always use an index run at 1/10 or less the
speed of a sequential scan against the same data. The same people
complaining "all over the place" about this topic are also the sort who
write them. There are two main fallacies at play here that make this
happen:

1) Even if you use an index, PostgreSQL must still retrieve the
associated table data to execute the query in order to execute its
version of MVCC

2) The sort of random I/O done by index lookups can be as much as 50X as
expensive on standard hard drives as sequential, if every block goes to
physical hardware.

If I were to work on improving this area, it would be executing on some
plans a few of us have sketched out for exposing some notion about what
indexes are actually in memory to the optimizer. There are more obvious
fixes to the specific case of temp tables though.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John R Pierce 2011-02-02 19:06:01 Re: Why "copy ... from stdio" does not return immediately when reading invalid data?
Previous Message Dimitri Fontaine 2011-02-02 18:39:49 Re: ALTER EXTENSION UPGRADE, v3

Browse pgsql-performance by date

  From Date Subject
Next Message Greg Smith 2011-02-02 18:58:48 Re: Exhaustive list of what takes what locks
Previous Message Robert Haas 2011-02-02 18:32:28 Re: [HACKERS] Slow count(*) again...