Skip site navigation (1) Skip section navigation (2)

Re: COUNT & Pagination

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Rajesh Kumar Mallah <mallah(at)trade-india(dot)com>
Cc: David Shadovitz <david(at)www(dot)shadovitz(dot)com>,"pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: COUNT & Pagination
Date: 2004-01-14 21:40:01
Message-ID: Pine.LNX.4.33.0401141436230.25071-100000@css120.ihs.com (view raw or flat)
Thread:
Lists: pgsql-performance
On Wed, 14 Jan 2004, Rajesh Kumar Mallah wrote:

> scott.marlowe wrote:
> 
> >On Tue, 13 Jan 2004, David Shadovitz wrote:
> >
> >  
> >
> >>>We avert the subsequent execution of count(*) by passing the
> >>>value of count(*) as a query parameter through the link in page
> >>>numbers.
> >>>      
> >>>
> >>Mallah, and others who mentioned caching the record count:
> >>
> >>Yes, I will certainly do this.  I can detect whether the query's filter has 
> >>been changed, or whether the user is merely paging through the results or 
> >>sorting* the results.
> >>
> >>I'd love to completely eliminate the cost of the COUNT(*) query, but I guess 
> >>that I cannot have everything.
> >>
> >>* My HTML table column headers are hyperlinks which re-execute the query, 
> >>sorting the results by the selected column.  The first click does an ASC 
> >>sort; a second click does a DESC sort.
> >>    
> >>
> >
> >another useful trick is to have your script save out the count(*) result 
> >in a single row table with a timestamp, and every time you grab if, check 
> >to see if x number of minutes have passed, and if so, update that row with 
> >a count(*). 
> >
> 
> Greetings!
> 
> The count(*) can get evaluated with any arbitrary combination
> in whre clause how do you plan to store that information ?
> 
> In a typical application pagination could be required in n number
> of contexts . I would be interested to know more about this trick
> and its applicability in such situations.
> 
> Offtopic:
> 
> Does PostgreSQL optimise repeated execution of similar queries ie
> queries on same table or set of tables (in a join) with same where clause
>  and only differing in LIMIT and OFFSET.

Yes, and no.

Yes, previously run query should be faster, if it fits in kernel cache.  

No, Postgresql doesn't cache any previous results or plans (unless you use 
prepare / execute, then it only caches the plan, not the query results).

Plus, the design of Postgresql is such that it would have to do a LOT of 
cache checking to see if there were any updates to the underlying data 
between selects.  Since such queries are unlikely to be repeated inside a 
transaction, the only place where you wouldn't have to check for new 
tuples, it's not really worth trying to implement.

Keep in mind most databases can use an index on max(*) because each 
aggregate is programmed by hand to do one thing.  In Postgresql, you can 
create your own aggregate, and since there's no simple way to make 
aggregates use indexes in the general sense, it's not likely to get 
optimized.  I.e. any optimization for JUST max(*)/min(*) is unlikely 
unless it can be used for the other aggregates.



In response to

Responses

pgsql-performance by date

Next:From: Ceri StoreyDate: 2004-01-14 23:10:15
Subject: Join optimisation Quandry
Previous:From: scott.marloweDate: 2004-01-14 21:35:52
Subject: Re: 100 simultaneous connections, critical limit?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group