RD-Tree, index support for int array (contrib-intarray)

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: <pgsql-hackers(at)postgresql(dot)org>
Cc: <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: RD-Tree, index support for int array (contrib-intarray)
Date: 2001-01-10 14:20:50
Message-ID: Pine.GSO.4.31.0101101716460.7062-101000@ra.sai.msu.su
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I attached archive with contrib-intarray. Please submit it into
7.1 beta source tree.

Here is README
--------------------------------------------------------------
This is an implementation of RD-tree data structure using GiST interface
of PostgreSQL. It has built-in lossy compression - must be declared
in index creation - with (islossy). Current implementation has index support
for one-dimensional array of int4's.
All works was done by Teodor Sigaev (teodor(at)stack(dot)net) and Oleg Bartunov
(oleg(at)sai(dot)msu(dot)su).

INSTALLATION:

gmake
gmake install
-- load functions
psql <database> < _int.sql

EXAMPLE USAGE:

create table message (mid int not null,sections int[]);
create table message_section_map (mid int not null,sid int not null);

-- create indices
CREATE unique index message_key on message ( mid );
CREATE unique index message_section_map_key2 on message_section_map (sid, mid );
CREATE INDEX message_rdtree_idx on message using gist ( sections ) with ( islossy);

-- select some messages with section in 1 OR 2 - OVERLAP operator
select message.mid from message where message.sections && '{1,2}';

-- select messages contains in sections 1 AND 2 - CONTAINS operator
select message.mid from message where message.sections @ '{1,2}';
-- the same, CONTAINED operator
select message.mid from message where '{1,2}' ~ message.sections;

TEST:

subdirectory test contains test suite.
cd ./test
1. createdb TEST
2. psql TEST < ../_int.sql
3. ./create_test.pl | psql TEST
4. ./bench.pl - perl script to benchmark queries, supports OR, AND queries
with/without RD-Tree. Run script without arguments to
see availbale options.

a)test without RD-Tree (OR)
./bench.pl -d TEST -s 1,2 -v
b)test with RD-Tree
./bench.pl -d TEST -s 1,2 -v -r

BENCHMARKS:

Size of table <message>: 200000
Size of table <message_section_map>: 268538

Distribution of messages by sections:

section 0: 73899 messages
section 1: 16298 messages
section 50: 1241 messages
section 99: 705 messages

old - without RD-Tree support,
new - with RD-Tree

+----------+---------------+----------------+
|Search set|OR, time in sec|AND, time in sec|
| +-------+-------+--------+-------+
| | old | new | old | new |
+----------+-------+-------+--------+-------+
| 1| 1.427| 0.215| -| -|
+----------+-------+-------+--------+-------+
| 99| 1.029| 0.018| -| -|
+----------+-------+-------+--------+-------+
| 1,2| 1.829| 0.334| 5.654| 0.042|
+----------+-------+-------+--------+-------+
| 1,2,50,60| 2.057| 0.359| 5.044| 0.007|
+----------+-------+-------+--------+-------+

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Attachment Content-Type Size
contrib-intarray.tar.gz application/octet-stream 10.6 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Lockhart 2001-01-10 14:43:16 Re: AW: Re: tinterval - operator problems on AIX
Previous Message Oleg Bartunov 2001-01-10 14:16:35