Re: Adding a suffix array index

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	Troels Arvin <troels(at)arvin(dot)dk>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Adding a suffix array index
Date:	2004-11-19 15:45:40
Message-ID:	1100875810.4113.13675.camel@localhost.localdomain
Lists:	pgsql-hackers

On Fri, 2004-11-19 at 10:42, Troels Arvin wrote:
> Hello,
>
> I'm working on a thesis project where I explore the addition of a
> specialized, bioinformatics-related data type to a RDBMS. My choice of
> RDBMS is PostgreSQL, of course, and I've started by adding a "dnaseq" (DNA
> sequence) data type, using PostgreSQL's APIs for type additions.
>
> The idea is to try to make it practical and even "attractive" to work with
> DNA sequences in an RDBMS. My starting goal is to make it viable to work
> with sequences in the 50-500 million base range. Some may think that
> RDBMSes and long chunks of data don't match well. My opinion is that the
> increasing power of computers and RDBMS software should at some point make
> it possible to work with DNA sequences in a "normal" data management
> setting like a RDBMS, instead of solely using stand-alone tools and
> stand-alone data files. Anyways, it's an open question if my hypothesis is
> right.
>

Presumably you know about these?

http://www.ncbi.nih.gov/BLAST/
http://www.ciri.upc.es/cela_pblade/BLAST.htm
http://www.netezza.com/products/bio.cfm

I think you're right, but you'd need more than one application of the
data for the argument to be convincing. Without parallelism, the best
you can hope for is to equal the speed of the single-use data
structures used in BLAST.

--
Best Regards, Simon Riggs

In response to

Adding a suffix array index at 2004-11-19 10:42:38 from Troels Arvin
