Re: Performance of count(*)

From: Tino Wildenhain <tino(at)wildenhain(dot)de>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance of count(*)
Date: 2007-03-22 19:56:50
Message-ID: 4602DF82.80304@wildenhain.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Michael Stone schrieb:
> On Thu, Mar 22, 2007 at 02:24:39PM -0400, Merlin Moncure wrote:
>> Tino was saying that rather that build a complete indexing storage
>> management solution that lives outside the database, it is better to
>> do intelligent session management so that you get the simplicity if a
>> two tier client server system but the scalability of a web app.
>
> No, what he was saying was "there's this thing called a cursor". I
> thought there was enough information in the original message to indicate
> that the author knew about cursors. There are certainly pros and cons
> and nuances to different approaches, but Tino's message didn't touch on
> anything that specific.

Sure, the message thread sometimes loose history so I wasnt 100% sure
what the framework really is - although I assumed it could be a web
solution. With stand alone applications you usually have a limited
number of users connecting and they are connected during the session
so you can easily use cursors there.

> And even if you do use some kind of "intelligent session management",
> how many simultaneous cursors can postgres sanely keep track of?
> Thousands? Millions? Tens of Millions? I suspect there's a scalability
> limit in there somewhere. Luckily I don't spend much time in the web
> application space, so I don't need to know. :)

Depending on the application, you can even simulate above situation
with a web framework if you manage session in the web framework
with persistent connections for a limited amount of users to work
the same time (certainly not feasable for a public web shop but for
data management systems for inhouse use). In this case, cursors
would be perfect too.

In any other case I fail to see the advantage in storing "index
data" outside the database with all the roundtripping involved.

If the query is complex and rerunning it for every batch is
expensive, fetching the whole result to the application in
case of users really traversing the complete batch
(How often is that really done? I mean, who browses to an
end of a huge result set?) is costy as well w/o really
benefit.

It would be much more easy and clean imho, in this case
to really fetch the data to session and batch linked
scratch table.

If its fast or you can prepare a batch helper table
with index, you can just select the batch equival
portion of the result.

You dont need extensive session management in the
web application to scroll thru result sets in this
way. This can all be encoded in forms or links.

Regards
Tino

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Tino Wildenhain 2007-03-22 20:35:50 Re: Performance of count(*)
Previous Message Tom Lane 2007-03-22 19:40:33 Re: Performance of count(*)