Re: query performance

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
Cc: "pepone(dot) onrez" <pepone(dot)onrez(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: query performance
Date: 2008-01-14 03:52:25
Message-ID: 18894.1200282745@sss.pgh.pa.us
Lists: pgsql-general

"Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com> writes:
> On Jan 13, 2008 8:58 PM, pepone. onrez <pepone(dot)onrez(at)gmail(dot)com> wrote:
>> t_documentcontent._id AS _id
>> FROM t_documentcontent LIMIT 50 OFFSET 80000

> with no order by, and possibly no index on t_documentcontent._id,
> there's no choice but a seq scan.

More to the point: a query with a huge OFFSET is *always* going to suck,
because there is no choice but to read through all those records before
getting to the ones you want. You need to fundamentally rethink how you
are going about this. I'm assuming this is actually just one query
in a series intended to eventually fetch the whole table.

One solution is to set up a cursor and FETCH 50 rows at a time from it.
However, that requires holding a transaction open, which might not work
well in your environment.
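
For example, something like this (an untested sketch; "doc_cur" is
just an illustrative cursor name):

BEGIN;
DECLARE doc_cur CURSOR FOR
    SELECT t_documentcontent._id AS _id FROM t_documentcontent;
FETCH 50 FROM doc_cur;   -- repeat until it returns fewer than 50 rows
CLOSE doc_cur;
COMMIT;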

Another possibility, if you have a primary key on the table, is to do
something like

SELECT ... FROM ... WHERE id > ? ORDER BY id LIMIT 50

where you leave out the WHERE clause on the first call, and on
subsequent calls '?' is the last id value seen in the prior call.
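
Concretely (untested; this assumes the primary key column is _id, and
1234 stands in for whatever the last id actually was):

-- first batch: no WHERE clause
SELECT t_documentcontent._id AS _id
FROM t_documentcontent ORDER BY _id LIMIT 50;

-- later batches: start just past the last id from the previous batch
SELECT t_documentcontent._id AS _id
FROM t_documentcontent WHERE _id > 1234 ORDER BY _id LIMIT 50;

Since _id has an index (the primary key gives you one), each call can
start right at the needed row instead of scanning past a huge OFFSET.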

regards, tom lane
