Re: [SQL] OFFSET impact on Performance???

From: PFC <lists(at)boutiquenumerique(dot)com>
To: alex(at)neteconomist(dot)com
Cc: "Greg Stark" <gsstark(at)mit(dot)edu>, "Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com>, "Andrei Bintintan" <klodoma(at)ar-sd(dot)net>, pgsql-performance(at)postgresql(dot)org
Subject: Re: [SQL] OFFSET impact on Performance???
Date: 2005-01-26 21:24:34
Message-ID: opsk8a28tcth1vuj@musicbox
Lists: pgsql-performance

> The problem with this approach is TTFB (Time To First Byte). The
> initial query is very slow, but additional requests are fast. In most
> situations we do not want the user to have to wait a disproportionate
> amount of time for the initial query. If this is the first time using
> the system this will be the impression that will stick with them. I
> guess we could experiment and see how much extra time creating a cache
> table will take...

Do it on the second page then ;)

Seriously:
- If you want to display the result count and page count, you'll need to
run the whole query anyway, so you might as well save the results.
- Inserting the result ids into a temp table one by one will be slow, but
you can do this instead:

select array_accum(id) from temp group by id/20 limit 3;
array_accum
---------------------------------------------------------------
{1,2,4,8,16,17,9,18,19,5,10,11,3,6,12,13,7,14,15}
{32,33,34,35,36,37,38,39,20,21,22,23,24,25,26,27,28,29,30,31}
{40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59}
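
(array_accum is not a built-in aggregate, by the way; if you don't have
it, the standard definition from the PostgreSQL documentation works:)

create aggregate array_accum (
    basetype = anyelement,
    sfunc = array_append,
    stype = anyarray,
    initcond = '{}'
);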

- A really big search, with 131072 results:
create table cache (id serial primary key, value integer[]);
explain analyze insert into cache (value) select array_accum(id) from temp
group by id/100;
 Subquery Scan "*SELECT*"  (cost=14382.02..17986.50 rows=131072 width=32) (actual time=961.746..1446.630 rows=1311 loops=1)
   ->  GroupAggregate  (cost=14382.02..16020.42 rows=131072 width=4) (actual time=961.607..1423.803 rows=1311 loops=1)
         ->  Sort  (cost=14382.02..14709.70 rows=131072 width=4) (actual time=961.181..1077.662 rows=131072 loops=1)
               Sort Key: (id / 100)
               ->  Seq Scan on "temp"  (cost=0.00..2216.40 rows=131072 width=4) (actual time=0.032..291.652 rows=131072 loops=1)
 Total runtime: 1493.304 ms

Note that the SELECT part takes 1400 ms and the INSERT part takes the
rest, which is really small. It's the sort which takes most of the time,
but you'd be doing it anyway to get your results in order, so it comes for
free. This generates 1311 pages with 100 results each (rows=1311 in the
plan above). If your searches yield, say, 1000 results, it'll be perfectly
fine, and you can target sub-100 ms times for caching the results (not
counting the total query time, of course!)
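
One caveat: the sort is on (id/100), so the pages come out in order, but
the ids inside each array are in no particular order (you can see that in
the arrays above), so sort each page in the application if that matters.
On later PostgreSQL versions (8.4 has the built-in array_agg aggregate,
and 9.0 accepts an ORDER BY inside the aggregate call) you could instead
write something like:

select array_agg(id order by id) from temp group by id/100;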

Using arrays is the key here, because inserting all the results as
individual rows in the table would be a whole lot slower!
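
Displaying page N is then a single primary-key lookup plus a join back to
the data. A minimal sketch (re-apply your ORDER BY at the end, since
= ANY doesn't preserve the array's order):

-- ids for page 3: one primary-key lookup
select value from cache where id = 3;

-- join them back to the actual result rows
select * from temp
where id = any ((select value from cache where id = 3));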
