Skip site navigation (1) Skip section navigation (2)

Re: Performance of count(*)

From: "Craig A(dot) James" <cjames(at)modgraph-usa(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Steve Atkins <steve(at)blighty(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance of count(*)
Date: 2007-03-22 19:18:24
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance
Tom Lane wrote:
> "Craig A. James" <cjames(at)modgraph-usa(dot)com> writes:
>> Steve Atkins wrote:
>>> As long as you're ordering by some row in the table then you can do that in
>>> straight SQL.
>>> select a, b, ts from foo where (stuff) and foo > X order by foo limit 10
>>> Then, record the last value of foo you read, and plug it in as X the next
>>> time around.
>> We've been over this before in this forum: It doesn't work as advertised.
>> Look for postings by me regarding the fact that there is no way to tell
>> the optimizer the cost of executing a function.  There's one, for example,
>> on Oct 18, 2006.
> You mean
> ?  I don't see anything there that bears on Steve's suggestion.
> (The complaint is obsolete as of CVS HEAD anyway.)

Mea culpa, it's October 8, not October 18:

The relevant part is this:

"My example, discussed previously in this forum, is a classic.  I have a VERY expensive function (it's in the class of NP-complete problems, so there is no faster way to do it).  There is no circumstance when my function should be used as a filter, and no circumstance when it should be done before a join.  But PG has no way of knowing the cost of a function, and so the optimizer assigns the same cost to every function.  Big disaster.

"The result?  I can't use my function in any WHERE clause that involves any other conditions or joins.  Only by itself.  PG will occasionally decide to use my function as a filter instead of doing the join or the other WHERE conditions first, and I'm dead.

"The interesting thing is that PG works pretty well for me on big tables -- it does the join first, then applies my expensive functions.  But with a SMALL (like 50K rows) table, it applies my function first, then does the join.  A search that completes in 1 second on a 5,000,000 row database can take a minute or more on a 50,000 row database."


In response to


pgsql-performance by date

Next:From: Tom LaneDate: 2007-03-22 19:40:33
Subject: Re: Performance of count(*)
Previous:From: Michael StoneDate: 2007-03-22 18:46:15
Subject: Re: Parallel Vacuum

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group