Re: yet another problem in recent builds, GIST this time

From: "Gene Selkov Jr(dot)" <selkovjr(at)mcs(dot)anl(dot)gov>
To: Hannu Krosing <hannu(at)trust(dot)ee>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: yet another problem in recent builds, GIST this time
Date: 1998-10-12 06:39:14
Message-ID: 199810120737.CAA22272@antares.mcs.anl.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hannu Krosing <hannu(at)trust(dot)ee> wrote:

> "Gene Selkov, Jr." <selkovjr(at)mcs(dot)anl(dot)gov> wrote:
> >
> > I think I have another problem similar to what I just reported. This time it is in GiST. It does not
> > complain when it builds the index but any attempt to use the table with exesisting GiST indices causes
> > this error:
>
> You seem to be the only one who actually uses GiST <grin>
>
> During last 2-3 years I have posted the question about the usability
> of GiST indexes to various Postgres lists about 3 times with absolutely
> no reaction, so assumed that they didn't work at all ;(

Maybe I came to know postgres too late, but I got the same feeling shortly after I subscribed to the lists. I had a few people ask about it but I doubt they made it to the point where it becomes useful. Maybe because it is not exactly a plug and play thing.

> Could you point me to any information (FAQs, TFMs, ...) about their
> usage ?

If only you and me ask about it, there would hardly be any FAQ. However, please refer to my earlier postings on the subject. I did make some changes to the code but I believe the description of my experience with GiST is up to date. Not sure how to quote the exact hyperlink (because of frames), but you can search the old pgsql-questions list for 'selkov gist'. The first two messages that come up are the most relevant (those dated Thu, 19 Feb 1998 13:40:18 and Wed, 08 Apr 1998 10:25:11)

> I have been under an impression that the easiest way of adding new
> indexing strategies (I personally need full-text) to postgres would
> be thru GiST, but as I have had no luck in getting them to work as
> they were, I assumed that they were in fact unsupported remnants of
> a long-forgotten project.

This is an abandoned project, but I would be happy to have it preserved in at least the state it was in 6.3.2 and before. It appears screwed up in 6.4.x

To put the long story short, GiST uses the strategies of R-tree, and it is, in fact, a version of R-tree. Its current implementation does not actually allow you to add new strategies, but it helps you reuse those defined for R-tree with various data types, unlike the postgres R-tree itself, which can be only be used used with built-ins, such as 2D geo types (boxes, polygons, etc.). There is an example of a GiST over text in Joe Hellerstein's source, http://selkov-7.mcs.anl.gov/pggist-patched.tgz, it might be close to what you need.

I asked Joe about further development and he told me part of it moved to the project referred to as PREDATOR, http://simon.cs.cornell.edu/Info/Projects/PREDATOR/predator.html

I did not yet look into it, but I was told that PREDATOR is a practical test bed for the most advanced indexing technologies. It is also an open source ORDBMS software. I have no idea what it's worth as a database server, but I was advised that it can be used as a development platform for new types and indexing algorithms (even those developed for other systems, such as postgres).

As to the GiST in postgres, we're on our own here. It is possible to get help from the original developers (in the form of questions and answers), but they are unlikely to do work on it actively.

> I feel happy and revitalised to know the contrary.
>
> --------------
> Hannu Krosing

Likewise, I am pleased to know someone else is thinking about it. Although I am amazed at the rate of progress postgreSQL is making, I wish it remained as science-oriented as it originally was. I believe the extensibility continues to be its major virtue. I witnessed numerous infertile attempts to use commercial business-oriented software for scientific databasing. There is very little you can do with money, int, float, date and text. It is the extensibility of types and access methods that makes any real-world database a gold mine for a researcher. See, for example, this site (http://wit.mcs.anl.gov/EMP/), where I am trying to put together a retrieval interface to the enzymology database, EMP. In particular, this example illustrates the use of extensions indexed with GiST: http://wit.mcs.anl.gov/EMP/select_emp_advanced.cgi?E1.ec_code=ec&E1.ec_code.op=%7E%09is+in+range&E1.ec_code.patt=2.1&ec_code.count=1&T1.text=tax&T1.text.op=%7E*%09matches+regex.%2C+case-insensitive&T1.tex!
t.patt=mammalia%7Crodent%7Cprimat%7Caves&T2.text=phd&T2.text.op=%7E%09matches+regex.%2C+case-sensitive&T2.text.patt=v%7CVM%7CMA%7CKC&T3.text=sl&T3.text.op=%7E*%09matches+regex.%2C+case-insensitive&T3.text.patt=cytosol&text.count=1&N1.seg=pho&N1.seg.op=%7E%09contained+in&N1.seg.patt=7+..+7.5&seg.count=1&constraint=%28N1+%26%26+T2%29+%26+E1+%26+T1+%26+T3&do=Run+the+query

Although this is still a very young project (as far as databasing goes), it is considered to be a unique achievement. Currently, my life depends on it: postgres and extensions are the only tools in their kind that allow me to accomplish my job before I am fired. You don't have to be familiar with enzymology to figure out that this kind of data can't be successfully used with Oracle or Sybase and clones.

Hope this does not scare you off...

--Gene

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas G. Lockhart 1998-10-12 06:53:11 Re: [HACKERS] Parser breakage: "timestamp" has become a reserved word
Previous Message Thomas G. Lockhart 1998-10-12 06:16:50 Re: [HACKERS] postmaster locking issues.