Re: Re: Storing XML in PostgreSQL

From: "John Gray" <jgray(at)beansindustry(dot)co(dot)uk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: Storing XML in PostgreSQL
Date: 2001-07-25 18:12:28
Message-ID: slumj9.c27.ln@adzuki
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

In article <m2elr5b2cx(dot)fsf(at)smaug(dot)polygnosis(dot)com>, gunnar(at)polygnosis(dot)com
(Gunnar =?iso-8859-1?q?R=F8nning?=) wrote:

> Do you have any documentation on your C functions ? I'm just interested
> in knowing what functions they provide.
>

There are only two (so far). They're very basic. I have:

pgxml_parse(text) returns bool
-parses the provided text and returns true or false if it is
well-formed or not.

pgxml_xpath(text doc, text xpath, int n) returns text
-parses doc and returns the cdata of the nth occurence of
the "XPath" listed. This does handle relative and absolute paths
but nothing else at present. I have a few variants of this.

So, given a table docstore:

Attribute | Type | Modifier
-----------+---------+----------
docid | integer |
document | text |

containing documents such as:

<?XML version="1.0"?>
<site provider="Foundations" sitecode="ak97" version="1">
<name>Church Farm, Ashton Keynes</name>
<invtype>watching brief</invtype>
<location scheme="osgb">SU04209424</location>
</site>

I can type:
select docid,
pgxml_xpath(document,'/site/name',1) as sitename,
pgxml_xpath(document,'/site/location',1) as location
from docstore;

and I get:

docid | sitename | location
-------+-----------------------------+------------
1 | Church Farm, Ashton Keynes | SU04209424
2 | Glebe Farm, Long Itchington | SP41506500
(2 rows)

The next thing is to use the "function as tuple source" support which is
underway in order to allow the return of a list (in the DTD I'm using
-and doubtless many others- certain elements might be repeated, and
I think it would be good to be able to join against all the data from a
particular element.

I hope this helps give a flavour. I'll try and tidy up the functions in the
next couple of days and then I can post what I've got so far. I'm keen to
build on this, as it's part of an (unfunded, unfortunately) project we're
doing. Expat is MIT-licensed so I don't imagine there's a problem linking
it into PostgreSQL.

One aim is to allow people to set pg functions as the handlers "direct"
from the parser -the catch is that the expat API has lots of handlers
(OK, so most of them are less commonly used), so it's a matter of
working out a) an efficient API for setting handlers on a particular
parser and b) how persistent a parser instance should be (each expat
instance can only do one document). Of course, expat may not be the
best one to use -it would be great to be parser-agnostic and use SAX
with a java parser, but I don't think we have java as a language for
user functions yet :-)

Incidentally, I'll be changing my email address over the next couple
of daysto jgray(at)azuli(dot)co(dot)uk -just so you can follow this thread after
I've done that....

Regards

John
Azuli IT

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2001-07-25 22:48:04 Re: Bug in psql tab completion
Previous Message Howard Williams 2001-07-25 16:11:55 Can Postgres handle 2-phase commits ?