Re: Multi-entry indexes (with a view to XPath queries)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "John Gray" <jgray(at)beansindustry(dot)co(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multi-entry indexes (with a view to XPath queries)
Date: 2001-06-25 20:48:52
Message-ID: 28692.993502132@sss.pgh.pa.us
Lists: pgsql-hackers

"John Gray" <jgray(at)beansindustry(dot)co(dot)uk> writes:
> Firstly, I appreciate this may be a hare-brained scheme, but I've been
> thinking about indexes in which the tuple pointer is not unique.

It sounds pretty hare-brained to me all right ;-). What's wrong with
the normal approach of one index tuple per heap tuple, ie, multiple
index tuples with the same key? It seems to me that your idea will just
make index maintenance a lot more difficult. For example, what happens
when one of the referenced rows is deleted? We'd have to actually
change, not just remove, the index tuple, since it'd also be pointing at
undeleted rows. That'll create a whole bunch of concurrency problems.
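The contrast can be seen in a toy model (hypothetical Python structures standing in for index entries, not PostgreSQL internals): with one index entry per heap tuple, a delete just drops that row's own entries; with one packed entry per key, a delete must rewrite a shared entry in place.

```python
# Normal approach: one index entry per heap tuple.
# Duplicates simply share the same key.
index = [("road", 101), ("road", 102), ("river", 103)]  # (key, heap_tid)

def delete_row(index, tid):
    # Deleting a heap tuple only removes that tuple's own entries;
    # no existing entry has to be modified.
    return [entry for entry in index if entry[1] != tid]

# Packed approach: one index entry holds many heap tids.
packed = {"road": [101, 102], "river": [103]}

def delete_row_packed(packed, tid):
    # Deleting a heap tuple must REWRITE the shared entry in place,
    # since it still points at undeleted rows -- the concurrency
    # hazard described above.
    for tids in packed.values():
        if tid in tids:
            tids.remove(tid)
    return packed

index = delete_row(index, 101)
print(index)                            # remaining (key, tid) pairs
print(delete_row_packed(packed, 101))   # "road" entry modified in place
```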

> Obviously I need to write a basic XML parser that can support such an
> xpath function, but it would also be good to index by the results of that
function, i.e. to have an index containing feature type values. As each
> document could have any number of these instances, the number of index
> tuples would differ from the number of heap tuples.

Why would you want multiple index entries for the same key (never mind
whether they are in a single index tuple or multiple tuples) pointing to
the same row?

Actually, after thinking a little more, I suspect the idea you are
really trying to describe here is index entries with finer-than-tuple
granularity. This is not silly, but it is sufficiently outside the
normal domain of SQL that I think you are fighting an uphill battle.
You'd be *much* better off creating a table that has one row per
indexable entity, whatever that is.

> I have tried the approach of decomposing documents into cdata, element and
> attribute tables, and I can use joins to extract a list of feature types
> etc. (and could use triggers to update this) but the idea of not having to
> parse a document to enter it into the database

How do you expect that to happen, when you will have to parse it to get
the index terms?

You might be able to address your problem with two tables, one holding
original documents and one with a row for each indexable entity
(document section). This second one would then have the field index
built on it.
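A schema along those lines might look like this (a hypothetical sketch using SQLite from Python for illustration; the table and column names are invented, not from the thread):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# One row per original document, stored whole.
cur.execute("CREATE TABLE documents (doc_id INTEGER PRIMARY KEY, body TEXT)")

# One row per indexable entity (document section), so the ordinary
# one-index-tuple-per-heap-tuple machinery applies.
cur.execute("""CREATE TABLE doc_sections (
    section_id   INTEGER PRIMARY KEY,
    doc_id       INTEGER REFERENCES documents(doc_id),
    feature_type TEXT)""")
cur.execute("CREATE INDEX doc_sections_feature_idx"
            " ON doc_sections(feature_type)")

cur.execute("INSERT INTO documents VALUES (1, '<map><road/><river/></map>')")
cur.executemany("INSERT INTO doc_sections VALUES (?, ?, ?)",
                [(1, 1, 'road'), (2, 1, 'river')])

# Find documents containing a given feature type via the section table.
rows = cur.execute("""SELECT DISTINCT d.doc_id
                      FROM documents d
                      JOIN doc_sections s ON s.doc_id = d.doc_id
                      WHERE s.feature_type = ?""", ('river',)).fetchall()
print(rows)
```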

regards, tom lane
