Re: Re: From TODO, XML?

From: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To: mlw <markw(at)mohawksoft(dot)com>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Frank Ch(dot) Eigler" <fche(at)redhat(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: From TODO, XML?
Date: 2001-07-30 05:43:26
Message-ID: Pine.LNX.4.21.0107301513110.27111-100000@linuxworld.com.au
Lists: pgsql-hackers

On Mon, 30 Jul 2001, mlw wrote:

> Bruce Momjian wrote:
> >
> > > > I would find it very helpful to see a table of what sorts of XML
> > > > functionality each major vendor supports.
> > >
> > > Actually I was thinking of databases of data, not database systems.
> >
> > I think we can go two ways. Allow COPY/pg_dump to read/write XML, or
> > write some perl scripts to convert XML to/from our pg_dump format. The
> > latter seems quite easy and fast.
>

> I have managed to get several XML files into PostgreSQL by writing a parser,
> and it is a huge hassle, the public parsers are too picky. I am thinking that a
> fuzzy parser, combined with some intelligence and an XML DTD reader, could make
> a very cool utility, one which I have not been able to find.

I have had the same problem. The best XML parser I could find was the
gnome-xml library at xmlsoft.org (libxml). I am currently using this in C
to replicate a client's legacy Notes system onto Postgres. In this case I
was lucky inasmuch as I had some input on the XML namespace etc. XML was
used because they had already designed an XML-based dump utility.

However, the way XML is being used is very basic: only table creation,
inserts and deletes are handled. Libxml works fine with this, though,
handling DTD/XML parsing, UTF-8, UTF-16 and ISO-8859-1, validation,
etc.
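
To make the "create table, insert, delete" case concrete, here is a
minimal sketch of turning such a dump into SQL. It is not the code
described above: it uses Python's standard xml.etree.ElementTree rather
than libxml in C, and the element names, attribute names, sample data and
the short list of allowed column types are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical dump format: these element and attribute names are
# assumptions made for this sketch, not any vendor's real format.
SAMPLE = """<dump>
  <create table="people">
    <column name="id" type="integer"/>
    <column name="name" type="text"/>
  </create>
  <insert table="people">
    <value column="id">1</value>
    <value column="name">O'Brien</value>
  </insert>
  <delete table="people" column="id" equals="1"/>
</dump>"""

# Assumed whitelist so that type names are never copied blindly into SQL.
ALLOWED_TYPES = {"integer", "text"}


def quote_literal(text):
    """Quote a string as an SQL literal by doubling single quotes."""
    return "'" + text.replace("'", "''") + "'"


def quote_ident(name):
    """Quote an SQL identifier by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def dump_to_sql(xml_text):
    """Translate the three supported instructions into SQL statements."""
    statements = []
    for node in ET.fromstring(xml_text):
        table = quote_ident(node.get("table"))
        if node.tag == "create":
            columns = []
            for col in node.findall("column"):
                if col.get("type") not in ALLOWED_TYPES:
                    raise ValueError("unsupported type: %r" % col.get("type"))
                columns.append(quote_ident(col.get("name")) + " " + col.get("type"))
            statements.append("CREATE TABLE %s (%s);" % (table, ", ".join(columns)))
        elif node.tag == "insert":
            values = node.findall("value")
            names = ", ".join(quote_ident(v.get("column")) for v in values)
            data = ", ".join(quote_literal(v.text or "") for v in values)
            statements.append("INSERT INTO %s (%s) VALUES (%s);" % (table, names, data))
        elif node.tag == "delete":
            condition = "%s = %s" % (
                quote_ident(node.get("column")),
                quote_literal(node.get("equals")),
            )
            statements.append("DELETE FROM %s WHERE %s;" % (table, condition))
        else:
            raise ValueError("unsupported instruction: %r" % node.tag)
    return statements


if __name__ == "__main__":
    for statement in dump_to_sql(SAMPLE):
        print(statement)
```

Every value is emitted as a quoted literal and left for the server to
coerce to the column type, which keeps the sketch short at the cost of
ignoring NULLs and binary data.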

The main problem then is that every vendor has a different XML
namespace. If people really want to pursue this, the best thing to do would be
to try to work with other open source database developers and design a
suitable XML namespace for open source databases. Naturally, there will be
much contention here about the most suitable this and that. It will be
difficult to get a real spec going and will probably be much more
trouble than it is worth. And if this fails, then we cannot expect
Oracle, IBM, Sybase, MS and the rest to ever do it.

Perhaps then it would be sufficient for pg_dump/restore to identify the
namespace of a given database dump and parse it according to that
namespace. Based on command-line arguments, pg_restore/dump could either
die on, ignore or transmogrify instructions in the XML which PG does not
support or recognise. It would also be useful if pg_dump could dump data
from postgres in the supported XML namespaces.
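
As a rough illustration of the die/ignore/transmogrify idea, the sketch
below reads the namespace from the root element of a dump and applies one
of the three policies to instructions it does not know. It is only a
sketch in Python's standard library, not a proposal for pg_restore itself;
the namespace URI, the instruction names, the rewrite table and the
on_unsupported parameter are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI mapped to the instructions we claim to support.
SUPPORTED = {"urn:example:vendor-a": {"table", "row"}}

# Hypothetical rewrites used by the "transmogrify" policy.
REWRITES = {"snapshot": "table"}

POLICIES = {"die", "ignore", "transmogrify"}

SAMPLE = """<dump xmlns="urn:example:vendor-a">
  <table name="people"/>
  <snapshot name="people_old"/>
  <trigger name="audit"/>
</dump>"""


def split_tag(tag):
    """Split ElementTree's '{uri}local' tag form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


def filter_dump(xml_text, on_unsupported="die"):
    """Return (namespace, kept, skipped) instruction names for a dump."""
    if on_unsupported not in POLICIES:
        raise ValueError("unknown policy: %r" % on_unsupported)
    root = ET.fromstring(xml_text)
    namespace, _ = split_tag(root.tag)
    if namespace not in SUPPORTED:
        raise ValueError("unrecognised dump namespace: %r" % namespace)
    supported = SUPPORTED[namespace]
    kept, skipped = [], []
    for child in root:
        _, name = split_tag(child.tag)
        if name in supported:
            kept.append(name)
        elif on_unsupported == "transmogrify" and name in REWRITES:
            # Map the foreign instruction onto one we do support.
            kept.append(REWRITES[name])
        elif on_unsupported == "die":
            raise ValueError("unsupported instruction: %r" % name)
        else:
            # "ignore", or "transmogrify" with no rewrite available.
            skipped.append(name)
    return namespace, kept, skipped


if __name__ == "__main__":
    print(filter_dump(SAMPLE, on_unsupported="transmogrify"))
```

Keeping the per-namespace knowledge in plain tables means a new vendor
format would need only a new entry, not new parsing code, though a real
tool would also have to translate attributes and data types.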

So it essentially comes down to how useful it will be and who has time to
code it up =) (as always).

**Creative Solution**

For those who have too much time on their hands and have managed to
untangle some of the syntax in the W3C XSLT 1.0 specification, how about
an XSL stylesheet to transform an XML-based database dump from some third
party into (postgres) SQL? Erk! There would have to be an award for such a
thing ;-).

Gavin
