Re: Query slows after offset of 100K

From: Mark Lewis <mark(dot)lewis(at)mir3(dot)com>
To: Michael Lorenz <mlorenz1(at)hotmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Query slows after offset of 100K
Date: 2008-02-14 20:32:12
Message-ID: 1203021132.9048.100.camel@archimedes
Lists: pgsql-performance

Michael,

Our application had a similar problem. To avoid having people click into
the middle of 750k records, we showed the first page with forward/back
links but no link to go to the middle. People could manually page forward
as far as they wanted, but nobody is going to sit there clicking Next 37k
times. We have several thousand users and none of them complained about
the change. Maybe that's because we improved the rest of the
searching/filtering interface at the same time, but I think people really
don't need to jump to the middle of the records anyway, as long as you
have decent search abilities.

If you wanted to keep your same GUI, one workaround would be to
periodically update a table which maps "page number" to "first unique
key on page". That decouples the expensive work to generate the page
offsets from the query itself, so if your data changes fairly
infrequently it might be appropriate. Sort of a materialized-view type
approach.
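
A rough sketch of that lookup-table idea, built around the query quoted
below (the `page_index` table and column names are just illustrative, and
`row_number()` assumes a PostgreSQL release with window functions):

```sql
-- Refreshed periodically, e.g. from a cron job; one row per 20-row page.
CREATE TABLE page_index (
    pageno     integer PRIMARY KEY,
    first_name text NOT NULL      -- first sort-key value on that page
);

INSERT INTO page_index (pageno, first_name)
SELECT (rn - 1) / 20, objectname
FROM (SELECT objectname,
             row_number() OVER (ORDER BY objectname) AS rn
      FROM object
      WHERE (deleted = false OR deleted IS NULL)
        AND accountid = 111) s
WHERE (rn - 1) % 20 = 0;

-- Jumping to page 500 then becomes an indexed lookup plus a short scan,
-- instead of skipping 10,000 rows with OFFSET:
SELECT o.objectid, o.objectname, o.isactive, o.modificationtime
FROM object o
WHERE (o.deleted = false OR o.deleted IS NULL)
  AND o.accountid = 111
  AND o.objectname >= (SELECT first_name FROM page_index WHERE pageno = 500)
ORDER BY o.objectname
LIMIT 20;
```

You'd need one such table (or one set of columns) per sort key the user
can choose, since the page boundaries differ for each ordering.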

If you can be approximate in your GUI, you can take this optimization
further. If people don't need to go directly to page 372898, but would be
satisfied with a page roughly 47% of the way into the massive result set
(think of a GUI slider), then you wouldn't need to update the lookup table
as often even if the data changed frequently. Adding a few thousand
records to a 750k row result set is statistically insignificant, so the
markers wouldn't need to be updated very often, and you wouldn't need to
store a marker for each page; maybe 100 markers spread evenly across the
result set would be sufficient.
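
The approximate-marker variant might look something like this (again, the
table and column names are illustrative, and `ntile()` assumes window
function support):

```sql
-- ~100 markers over the sorted set; refresh only occasionally.
CREATE TABLE slider_marker (
    bucket     integer PRIMARY KEY,  -- 1..100, i.e. percent of the way in
    first_name text NOT NULL
);

INSERT INTO slider_marker (bucket, first_name)
SELECT bucket, min(objectname)
FROM (SELECT objectname,
             ntile(100) OVER (ORDER BY objectname) AS bucket
      FROM object
      WHERE (deleted = false OR deleted IS NULL)
        AND accountid = 111) s
GROUP BY bucket;

-- Slider at ~47%: start from marker 47 and page forward keyset-style:
SELECT o.objectid, o.objectname, o.isactive, o.modificationtime
FROM object o
WHERE (o.deleted = false OR o.deleted IS NULL)
  AND o.accountid = 111
  AND o.objectname >= (SELECT first_name FROM slider_marker WHERE bucket = 47)
ORDER BY o.objectname
LIMIT 20;
```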

-- Mark Lewis

On Thu, 2008-02-14 at 19:49 +0000, Michael Lorenz wrote:
> Fair enough, and I did think of this as well. However, I didn't think this was a viable option in my case, since we're currently allowing the user to randomly access the pages (so $lastkey wouldn't really have any meaning). The user can choose to sort on object ID, name or modification time, and then go straight to any page in the list. With 750K records, that's around 37K pages.
>
> Maybe a better way to phrase my question is: how can I paginate my data on 3 different keys which allow random access to any given page, and still get reasonable performance? Should I just force the user to limit their result set to some given number of records before allowing any paginated access? Or is it just not practical, period?
>
> Thanks,
> Michael Lorenz
>
> ----------------------------------------
> > To: mlorenz1(at)hotmail(dot)com
> > CC: pgsql-performance(at)postgresql(dot)org
> > Subject: Re: [PERFORM] Query slows after offset of 100K
> > Date: Thu, 14 Feb 2008 14:08:15 -0500
> > From: tgl(at)sss(dot)pgh(dot)pa(dot)us
> >
> > Michael Lorenz writes:
> >> My query is as follows:
> >> SELECT o.objectid, o.objectname, o.isactive, o.modificationtime
> >> FROM object o
> >> WHERE ( o.deleted = false OR o.deleted IS NULL )
> >> AND o.accountid = 111
> >> ORDER BY 2
> >> LIMIT 20 OFFSET 10000;
> >
> > This is guaranteed to lose --- huge OFFSET values are never a good idea
> > (hint: the database still has to fetch those rows it's skipping over).
> >
> > A saner way to do pagination is to remember the last key you displayed
> > and do something like "WHERE key > $lastkey ORDER BY key LIMIT 20",
> > which will allow the database to go directly to the desired rows,
> > as long as you have an index on the key. You do need a unique ordering
> > key for this to work, though.
> >
> > regards, tom lane
> >
>
