Re: Postgres + Xapian (was Re: fulltext searching via a custom index type )

From: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
To: "Eric B(dot)Ridge" <ebr(at)tcdi(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgres + Xapian (was Re: fulltext searching via a custom index type )
Date: 2004-01-02 21:54:29
Message-ID: 20040102215429.GA507@dcc.uchile.cl
Lists: pgsql-general pgsql-hackers

On Thu, Jan 01, 2004 at 11:19:07PM -0500, Eric B.Ridge wrote:

> I couldn't think of a way to create a whole new database type for
> Xapian that could deal with managing 5 btree indexes inside of Postgres
> (other than using tables w/ standard postgres btree index on certain
> fields), so instead, I dug into Xapian and abstracted out its
> filesystem i/o (open, read, write, etc).
>
> (as an aside, I did spend some time pondering ways to adapt Postgres'
> nbtree AM to handle this, but I just don't understand how it works)

I think your approach is too ugly. You will have tons of problems the
minute you start thinking about concurrency (unless you want to allow
only a single user accessing the index) and recovery (unless you want to
force users to REINDEX when the system crashes).

I think one way of attacking the problem would be to use the existing
nbtree and let it store all five btrees in a single index. First read the
README in the nbtree directory, and then poke at the single structure
stored in the metapage. You will see that it holds a BlockNumber pointing
to the root page of the index. Try modifying that so it holds a BlockNumber
for each btree's root page. You will have to provide ways to access each
root page, and maybe other nonstandard things (such as telling the root
split operation which root page you are going to split), but you will get
recovery and concurrency (at least to a point) for free.
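
To make the idea concrete, here is a rough, standalone sketch in C of a
metapage that tracks one root per btree. This is not the actual nbtree
code: the struct, the XAPIAN_NUM_TREES constant and the multiroot_*
helpers are all hypothetical names invented for illustration, and
BlockNumber and InvalidBlockNumber are redefined locally so the fragment
compiles outside the Postgres tree. Locking, WAL logging and page layout
are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins so the sketch compiles outside the Postgres tree. */
typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Hypothetical: the five btrees mentioned in the quoted message. */
#define XAPIAN_NUM_TREES 5

/*
 * Hypothetical metapage contents: instead of a single root pointer,
 * keep one root block number (and tree height) per btree.
 */
typedef struct MultiRootMetaPageData
{
    uint32_t    magic;                      /* sanity marker */
    uint32_t    version;                    /* on-disk format version */
    BlockNumber roots[XAPIAN_NUM_TREES];    /* root page of each btree */
    uint32_t    levels[XAPIAN_NUM_TREES];   /* height of each btree */
} MultiRootMetaPageData;

/* Mark every btree as empty (no root page allocated yet). */
static void
multiroot_meta_init(MultiRootMetaPageData *meta, uint32_t magic,
                    uint32_t version)
{
    int i;

    assert(meta != NULL);
    meta->magic = magic;
    meta->version = version;
    for (i = 0; i < XAPIAN_NUM_TREES; i++)
    {
        meta->roots[i] = InvalidBlockNumber;
        meta->levels[i] = 0;
    }
}

/* Look up the root of one btree; out-of-range ids yield "no root". */
static BlockNumber
multiroot_get_root(const MultiRootMetaPageData *meta, int tree_id)
{
    if (tree_id < 0 || tree_id >= XAPIAN_NUM_TREES)
        return InvalidBlockNumber;
    return meta->roots[tree_id];
}

/*
 * Record a new root for one btree, as a root split would need to do.
 * The caller says which tree is being split.  Returns 0 on success,
 * -1 if tree_id is out of range.
 */
static int
multiroot_set_root(MultiRootMetaPageData *meta, int tree_id,
                   BlockNumber new_root, uint32_t new_level)
{
    if (tree_id < 0 || tree_id >= XAPIAN_NUM_TREES)
        return -1;
    meta->roots[tree_id] = new_root;
    meta->levels[tree_id] = new_level;
    return 0;
}
```

The point of passing tree_id everywhere is the one made above: every
operation that today assumes "the" root would have to be told which of
the five roots it is working on.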

Hope this helps,

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"La espina, desde que nace, ya pincha" (Proverbio africano)
