Re: Adding a suffix array index

From: Troels Arvin <troels(at)arvin(dot)dk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Adding a suffix array index
Date: 2004-12-03 15:31:48
Message-ID: pan.2004.12.03.15.31.48.161260@arvin.dk
Lists: pgsql-hackers

On Sun, 28 Nov 2004 17:53:38 -0500, Tom Lane wrote:
>> But is it cheaper, IO-wise to "jump" around in an index than to go back
>> and forth between index and tuple blocks?
>
> Perhaps not --- but why would you be "jumping around"? Wouldn't the
> needed info appear in consecutive locations in the index?

Searching for a match using a suffix array entails a binary search in the
suffix array (which can be optimized through the use of a
longest-common-prefix helper array). So some amount of "jumping" is needed.
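
To illustrate where the "jumping" comes from, here is a minimal sketch in
Python (not PostgreSQL code). The function names are made up for this
example, the suffix array is built naively by sorting, and the
longest-common-prefix optimization is left out. Each probe of the binary
search lands at an unrelated position of the array, and then at an
unrelated offset of the sequence:

```python
def build_suffix_array(text):
    # Naive construction, for illustration only: sort the suffix start
    # positions by the suffix they denote.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return the sorted start positions of pattern in text."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2  # each probe "jumps" to a new array slot
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All matching suffixes sit next to each other in sa[start:lo].
    return sorted(sa[start:lo])
```

Once the two bounds are found, the matching entries are indeed consecutive
in the array; it is the O(log n) probes needed to find the bounds that are
scattered.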

Anyhow: I've given up trying to create the suffix array as a "normal"
index type. As I'm running out of time, I'm inclined to resort to hacks.
I'm considering storing the index of a sequence in a large object and
keeping a reference to that large object in the data item. The large
object interface seems like something I could use.

Or I might store dnaseq values as
- some meta-information, perhaps (like a hash value)
- a reference to a large object containing the sequence
- a reference to a large object containing the suffix array
- a reference to a large object containing a helper-array
(longest common prefix-information)
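
For concreteness, the helper-array in the last item could be computed
roughly as below. This is a sketch in Python using Kasai's algorithm,
which is only one possible choice; the function names are made up for the
example, and the suffix array is again built naively:

```python
def build_suffix_array(text):
    # Naive construction, for illustration only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_lcp_array(text, sa):
    """lcp[k] = length of the common prefix of the suffixes at sa[k-1]
    and sa[k]; lcp[0] is 0 by convention."""
    n = len(text)

    # rank[i] = position of the suffix starting at i within sa.
    rank = [0] * n
    for pos, start in enumerate(sa):
        rank[start] = pos

    lcp = [0] * n
    h = 0  # common-prefix length carried over from the previous suffix
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = sa[rank[i] - 1]  # suffix preceding suffix i in sorted order
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1  # the next suffix shares at least h - 1 characters
    return lcp
```

Both arrays have one integer per character of the sequence, which is why
they would be stored out of line rather than inside the dnaseq value.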

One problem with this approach is that the related large objects will not
automatically be removed when a value is removed from a table (but that
could probably be fixed, more or less, using a trigger). Apart from being
somewhat ugly: Is it possible?

How much of [1] is still the case today? Do today's large objects roughly
correspond to the article's description of "v-segments"?
The article mentions that POSTGRES supported a CREATE LARGE TYPE
construct. Am I right in assuming that today's corresponding construct is
as exemplified in the manual:
CREATE TYPE bigobj (INPUT = lo_filein, OUTPUT = lo_fileout,...)

Reference 1:
Stonebraker & Olson: Large Object Support in POSTGRES (1993)
http://epoch.cs.berkeley.edu:8000/postgres/papers/S2K-93-30.pdf

--
Greetings from Troels Arvin, Copenhagen, Denmark
