Re: [HACKERS] Reverse key index

From: Kenneth Marshall <ktm(at)rice(dot)edu>
To: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
Cc: PGSQL General <pgsql-general(at)postgresql(dot)org>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Reverse key index
Date: 2008-02-04 13:35:07
Message-ID: 20080204133507.GM4201@it.is.rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Pretty neat. It may be a possible alternative to the use of
the hash index in some applications.

Cheers,
Ken

On Sun, Feb 03, 2008 at 07:13:23PM -0800, Gurjeet Singh wrote:
> Hi All,
>
> I have wanted to create a reverse key index for some time now, and it
> seems that an evening of reading and half a day of efforts finally paid off.
> This is just a proof of concept, and sure, the bit-reversing technique can
> use a native language's power for better results.
>
> I started with the following posts:
>
> http://archives.postgresql.org/pgsql-hackers/2002-01/msg01201.php
> http://archives.postgresql.org/pgsql-hackers/2002-01/msg01225.php
>
> The operator class that is created at the end uses one function to
> populate the index in almost a random manner (reverse binary
> representation). And this op-class provides just one operator to compare the
> values, as opposed to Tom's suggestion ("all the other members would be
> byte-reversed-comparison operators"); this is because if we allow the index
> to use any of these other operators (custom or the built-in ones) for range
> scans, the range's starting value will be found for sure, but because the
> btree index follows the leaf nodes from there on, the results will be
> totally what we never asked for!
>
> The result at the end, INDEX del_i, is an index that helps disperse
> heavy sequential INSERTs from different sessions over to different index
> blocks, reducing index block contention hence improving performance. Also,
> this index can be used of equality operator (but no other operator).
>
> Hackers, of course, comments please. Let me know if I have missed
> something, and if this is not exactly what a user would want!
>
> For fun: If you wish to see how a BTree index performs the comparisons
> and populates the index, just uncomment the 'raise notice' statement in
> rev_int_cmp(). And to compare the bit-reversed mode to the normal mode of
> index population, just replace the contents of declare section with 'rev_a
> int = a; rev_b int = b;' in the declare section. :) have fun.
>
> I have uploaded my original, unedited file from the efforts here. It
> goes to lengths to create functions and operators and what not; may be
> helpful for other noobs chasing operators.
> http://www.geocities.com/gurjeet79/reverse_key_index.sql.txt
>
> Best regards,
>
> PS: I think my signature should be:
> 'Do I LOVE Postgres or what!!'
> OR 'I am in LOVE with Postgres'
> OR 'Postgres is _is_ *is* BEAutiful!'
> OR <you suggest>
>
> --------------- CODE ---------------
>
> --- Support
>
> create or replace function public.reverse_string( str varchar )
> returns varchar
> strict
> immutable
> language plpgsql
> as $$
> declare reversed varchar = '';
> begin
> for i in reverse char_length( str ) .. 1 loop
> reversed = reversed || substring( str from i for 1 );
> end loop;
> return reversed;
> end;
> $$;
>
> create or replace function public.rev_int_cmp( a int, b int )
> returns int
> strict
> immutable
> language plpgsql
> as $$
> declare
> rev_a int = reverse_string( a::bit(32)::varchar )::bit(32)::int;
> rev_b int = reverse_string( b::bit(32)::varchar )::bit(32)::int;
> begin
> -- raise notice 'rev_int_cmp( %, % ) called', a, b;
> if( rev_a < rev_b ) then
> return -1;
> elsif( rev_a > rev_b ) then
> return +1;
> else
> return 0;
> end if;
> end;
> $$;
>
> --- Operator class
>
> drop operator class if exists public.rev_int_ops using btree cascade;
> create operator class public.rev_int_ops for type int using btree as
> operator 3 pg_catalog.=,
> function 1 public.rev_int_cmp( int, int );
>
> --- example
>
> drop table if exists del;
> create table del( a int, b char(128) );
> create index del_i on del( a rev_int_ops );
> insert into del select s, s+1 from generate_series( 1, 1000 ) as s; -- rev
> vacuum full analyze del;
>
> explain
> select * from del;
> explain
> select * from del order by a;
> explain
> select * from del where a = 2; -- should use the reverse index
> explain
> select * from del where a < 200; -- should NOT use the reverse index
>
> truncate del;
>
>
> --
> gurjeet[(dot)singh](at)EnterpriseDB(dot)com
> singh(dot)gurjeet(at){ gmail | hotmail | indiatimes | yahoo }.com
>
> EnterpriseDB http://www.enterprisedb.com
>
> 17? 29' 34.37"N, 78? 30' 59.76"E - Hyderabad
> 18? 32' 57.25"N, 73? 56' 25.42"E - Pune
> 37? 47' 19.72"N, 122? 24' 1.69" W - San Francisco *
>
> http://gurjeet.frihost.net
>
> Mail sent from my BlackLaptop device

In response to

Browse pgsql-general by date

  From Date Subject
Next Message jiniusatwork-postgresql 2008-02-04 14:31:44 Re: Performance problems with Postgresql/ZFS/Non-global zones on Solaris?
Previous Message Alvaro Herrera 2008-02-04 13:10:24 Re: PGSQL ERROR: FATAL: terminating connection due to administrator command

Browse pgsql-hackers by date

  From Date Subject
Next Message Ron Mayer 2008-02-04 14:00:01 Re: configurability of OOM killer
Previous Message Andrei Kovalevski 2008-02-04 11:16:23 Re: NULL OR ZERO