Re: GiST opclass and varlena

From: Dragan Zubac <zubac(at)vlayko(dot)tv>
To: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: GiST opclass and varlena
Date: 2008-03-25 16:57:11
Message-ID: 47E92EE7.7030704@vlayko.tv
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello

Not so familiar with all this math :) ,but here's my solution,which I
must admit keep things going at the moment:

db=> \d prefix
Table "public.prefix"
Column | Type | Modifiers
----------+---------+-----------------------------------------------------
id | bigint | not null default nextval('prefix_id_seq'::regclass)
prefix | text |
operator | integer |
Indexes:
"prefix_pkey" PRIMARY KEY, btree (id)
"prefix_index" UNIQUE, btree (prefix)

and we're using a procedure to match prefices (longest prefix
match),with simething like:

...

-- CHECK PREFIX START

while tmp_length <= char_length(d_number) loop

-- take the number and try to find it in prefix table
-- if not found,decrease it by removing last number
-- and try again

tmp_dest_number := substring (d_number from 1 for tmp_length);

select into operator_temp operator from prefix
where prefix=tmp_dest_number;

if not found then
tmp_length := tmp_length + 1;
else

-- if we have a match with some prefix
-- take the operator from that row

operatorfound := true;
operator_out := operator_temp;
exit;
end if;

end loop;

-- CHECK PREFIX STOP

....

only 'semantic' problem You might have with this approach is that number
is like 16511xxxxx,which belongs to some Hawaii island operator :),but
the problem is that all You have in Your prefix table is 16xxxxxx,which
You mark to belong to operator USA Something. In that case,Your system
will think of 16511xxxxxx number as it belongs to USA Something operator
and not Hawaii island operator :( Only solution to this is to always
have up-to-date prefix table,and populate it even with the prefices Your
system does not support,because Your system then reject that number and
it will have a good/precise reason why he did it :)

Some poorly measurement showed some 60-80 matching/sec with this
algorithm of matching prefices and a couple of concurrent database
connections.

Sincerely

Dragan

Dimitri Fontaine wrote:
> Hi,
>
> I'm trying to code a GiST opclass to index prefix searches (select ... from t
> where t.prefix @> query), now using a prefix_range datatype. This datatype is
> a varlena one, and storing it to disk and indexing it with BTrees work ok,
> but I'm failing to have my GiST opclass working, here's the problem:
>
> postgres=# create index idx_prefix on ranges using gist(prefix
> gist_prefix_range_ops);
> NOTICE: gpr_picksplit(): entryvec->n= 234 maxoff= 232 l= 176 r= 56 l+r= 232
> unionL='01[0-7]' unionR='01[4-7]'
> NOTICE: gpr_picksplit(): v->spl_ldatum='01[0-7]' v->spl_rdatum='01[4-7]'
> ERROR: invalid memory alloc request size 3049878020
>
> The code is available at pgfoundry here:
> http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/prefix/prefix/
>
> The previous support for prefixes as text is still there (but will get
> deprecated soon --- or so I hope), and the new datatype and its usage not
> well commented nor documented currenlty. If this show up as a requirement to
> get your attention, please state it and I'll work on documenting prefix_range
> first.
>
> I'm looking for some help on how to resolve the shown index creation problem,
> which I think is related to how I give data to GiST in its spl_ldatum and
> spl_rdatum from the user defined picksplit() method, lines 1101 and 1102 in
> prefix.c (version 1.26).
>
> Regards,
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2008-03-25 17:12:36 PG East and Washington DC tour
Previous Message Bruce Momjian 2008-03-25 16:52:01 Re: [pgsql-www] New email list for emergency communications