Skip site navigation (1) Skip section navigation (2)

Re: [SQL] OFFSET impact on Performance???

From: PFC <lists(at)boutiquenumerique(dot)com>
To: "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Cc: alex(at)neteconomist(dot)com, "Greg Stark" <gsstark(at)mit(dot)edu>,"Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com>,"Andrei Bintintan" <klodoma(at)ar-sd(dot)net>,pgsql-performance(at)postgresql(dot)org
Subject: Re: [SQL] OFFSET impact on Performance???
Date: 2005-01-27 13:11:13
Message-ID: opsk9iwzf2th1vuj@musicbox (view raw or flat)
Thread:
Lists: pgsql-performance

> for example,  
> http://www.sai.msu.su/~megera/postgres/gist/code/7.3/README.intarray
> see OPERATIONS and EXAMPLE USAGE:
>

	Thanks, I already know this documentation and have used intarray before  
(I find it absolutely fabulous in the right application, it has a great  
potential for getting out of tight situations which would involve huge  
unmanageable pivot or attributes tables). Its only drawback is that the  
gist index creation time is slow and sometimes just... takes forever until  
the disk is full.
	However, it seems that integer && integer[] does not exist :

> SELECT * FROM table WHERE id && int[]

explain analyze select * from temp t where id &&  
( '{1,2,3,4,5,6,7,8,9,10,11,12}'::integer[] );
ERREUR:  L'operateur n'existe pas : integer && integer[]
ASTUCE : Aucun operateur correspond au nom donne et aux types d'arguments.  
Vous devez ajouter des conversions explicites de type.

	I have already used this type of intarray indexes, but you have to create  
a special gist index with the gist__int_ops on the column, and the column  
has to be an array. In my case the column is just a SERIAL PRIMARY KEY,  
and should stay this way, and I don't want to create a functional index in  
array[id] just for this feature ; so I guess I can't use the && operator.  
Am I mistaken ? My index is the standard btree here.
	
	It would be nice if the =ANY() could use the index just like IN does ;  
besides at planning time the length of the array is known which makes it  
behave quite just like IN().

	So I'll use either an EXECUTE'd plpgsql-generated query (IN (....)) ,  
which I don't like because it's a kludge ; or this other solution which I  
find more elegant :

CREATE OR REPLACE FUNCTION tools.array_srf( INTEGER[] )
         RETURNS SETOF INTEGER        RETURNS NULL ON NULL INPUT         
LANGUAGE plpgsql        AS
$$
DECLARE
	_data	ALIAS FOR $1;
	_i		INTEGER;
BEGIN
	FOR _i IN 1..icount(_data) LOOP
		RETURN NEXT _data[_i];
	END LOOP;
	RETURN;
END;
$$;

-----------------------------------------------------------------------------------
explain analyze select * from temp t where id  
=ANY( '{1,2,3,4,5,6,7,8,9,10,11,12}' );
  Seq Scan on "temp" t  (cost=0.00..5165.52 rows=65536 width=8) (actual  
time=0.030..173.319 rows=12 loops=1)
    Filter: (id = ANY ('{1,2,3,4,5,6,7,8,9,10,11,12}'::integer[]))
  Total runtime: 173.391 ms

-----------------------------------------------------------------------------------
explain analyze select * from temp t where id  
IN( 1,2,3,4,5,6,7,8,9,10,11,12 );
  Index Scan using temp_pkey, temp_pkey, temp_pkey, temp_pkey, temp_pkey,  
temp_pkey, temp_pkey, temp_pkey, temp_pkey, temp_pkey, temp_pkey,
temp_pkey on "temp" t  (cost=0.00..36.49 rows=12 width=8) (actual  
time=0.046..0.137 rows=12 loops=1)
    Index Cond: ((id = 1) OR (id = 2) OR (id = 3) OR (id = 4) OR (id = 5)  
OR (id = 6) OR (id = 7) OR (id = 8) OR (id = 9) OR (id = 10) OR (id = 11)  
OR (id = 12))
  Total runtime: 0.292 ms

-----------------------------------------------------------------------------------
explain analyze select * from temp t where id in (select * from  
tools.array_srf('{1,2,3,4,5,6,7,8,9,10,11,12}'));
  Nested Loop  (cost=15.00..620.20 rows=200 width=8) (actual  
time=0.211..0.368 rows=12 loops=1)
    ->  HashAggregate  (cost=15.00..15.00 rows=200 width=4) (actual  
time=0.160..0.173 rows=12 loops=1)
          ->  Function Scan on array_srf  (cost=0.00..12.50 rows=1000  
width=4) (actual time=0.127..0.139 rows=12 loops=1)
    ->  Index Scan using temp_pkey on "temp" t  (cost=0.00..3.01 rows=1  
width=8) (actual time=0.010..0.012 rows=1 loops=12)
          Index Cond: (t.id = "outer".array_srf)
  Total runtime: 0.494 ms

-----------------------------------------------------------------------------------
explain analyze select * from temp t, (select * from  
tools.array_srf('{1,2,3,4,5,6,7,8,9,10,11,12}')) foo where foo.array_srf =  
t.id;

  Merge Join  (cost=62.33..2824.80 rows=1000 width=12) (actual  
time=0.215..0.286 rows=12 loops=1)
    Merge Cond: ("outer".id = "inner".array_srf)
    ->  Index Scan using temp_pkey on "temp" t  (cost=0.00..2419.79  
rows=131072 width=8) (actual time=0.032..0.056 rows=13 loops=1)
    ->  Sort  (cost=62.33..64.83 rows=1000 width=4) (actual  
time=0.169..0.173 rows=12 loops=1)
          Sort Key: array_srf.array_srf
          ->  Function Scan on array_srf  (cost=0.00..12.50 rows=1000  
width=4) (actual time=0.127..0.139 rows=12 loops=1)
  Total runtime: 0.391 ms

-----------------------------------------------------------------------------------
Note that the meaning is different ; the IN removes duplicates in the  
array but the join does not.



In response to

Responses

pgsql-performance by date

Next:From: Oleg BartunovDate: 2005-01-27 13:44:04
Subject: Re: [SQL] OFFSET impact on Performance???
Previous:From: Greg StarkDate: 2005-01-27 12:57:15
Subject: Re: [SQL] OFFSET impact on Performance???

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group