Re: best paging strategies for large datasets?

From: Louis-David Mitterrand <vindex+lists-pgsql-sql(at)apartia(dot)org>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: best paging strategies for large datasets?
Date: 2010-05-13 07:43:54
Message-ID: 20100513074353.GA19779@apartia.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Wed, May 12, 2010 at 12:26:17PM -0400, Justin Graf wrote:
> oops typos
> On 5/12/2010 1:41 AM, Louis-David Mitterrand wrote:
> > Hi,
> >
> > I have a large dataset (page 1 at http://www.cruisefish.net/stat.md) and
> > am in the process of developping a pager to let users leaf through it
> > (30K rows).
>
> That's not that big of a record set.

Well for me it's a big one :) But then again it's my first serious web
app.

> > Ideally I'd like to know when requesting any 'page' of data where I am
> > within the dataset: how many pages are available each way, etc.
> >
> > Of course that can be done by doing a count(*) query before requesting a
> > limit/offset subset. But the main query is already quite slow, so I'd
> > like to minimize them.
> >
>
> What do you mean by quite slow??

Like several seconds. I have to cache the results.

> On a 30K record table count() and query speed should not be a problem..

This query is a large multi-join of times series data, not a single
table. And it's not (prematurely :) optimized.

I'm planning a materialized view for it.

> > But I am intrigued by window functions, especially the row_number() and
> > ntile(int) ones.
> >
> > Adding "row_number() over (order by<reverse query>)" to my query will
> > return the total number of rows in the first row, letting my deduce the
> > number of pages remaining, etc. row_number() apparently adds very little
> > cost to the main query.
>
> That will get a sequential number, but you still don't know how many
> records are in the table, limit and offset block that value.
> I don't see how this helps?
>
> Limit and Offset with Total Record count tell us where we are in the
> record set and which page we are on.

Hmm, good to know. I hadn't tried that yet.

> RecordCount/Limit = Number of pages
> CurrentPage = (offset%RecordCount)/Limit

These simple formulas we bill handy.

> to complicate things further what if the site allows user to change the
> number of records displayed per page. The pager logic needs to figure
> out how many records need to be return per page, and what the next and
> previous iterations are. Without the total record count I don't see how
> that is even possible.
>
> I have written pagers in ASP and PHP

Thanks for your input. I now realize I'll have to get a total count in a
separate (cached) query, or else I'll only be able to provide a basic
"previous/next" pager.

Cheers,

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Louis-David Mitterrand 2010-05-13 07:47:13 Re: best paging strategies for large datasets?
Previous Message silly sad 2010-05-13 07:36:53 Re: best paging strategies for large datasets?