Re: Dreaming About Redesigning SQL

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: sailesh(at)cs(dot)berkeley(dot)edu
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Dreaming About Redesigning SQL
Date: 2003-10-20 05:19:00
Message-ID: 200310192219.00118.josh@agliodbs.com
Lists: pgsql-hackers

Sailesh,

Warning: I get carried away in this response. I'm afraid that I'm a fond
reader of Fabian Pascal and CJ Date, so I have far too much to say on the
topic. So if you really care about XML databases, you should probably hold
off on reading the rest until you're well-caffeinated and in a cheerful frame
of mind.

Also, let me clarify that there is a distinction between using XML *as a*
database, and putting XML documents into databases of other types. I find
the latter obvious and sensible, but the former a silly and wrong-headed
idea, and it's the pure-XML-database which I attack below.

If you want to really have this out, I live in San Francisco and I love to
argue. Coffee at Intermezzo? I'll buy.

-------------------------------
> If you look at the academic research work, there have been gazillions
> of recent papers on XML database technology.

Point me to one which presents an algebra, calculus, or other mathematical
underpinning of XML databases, and I will be happy to eat my words on this
list. I can easily find lots of papers using Google, but all of them are
about *technical implementation* and do not provide a theoretical
underpinning for XML databases.

A few (such as Dan Suciu's paper) present some theory to back XQuery, but
XQuery is presented there entirely as an XML-based data access extension to
SQL ... a role which seems fine to me.

Others, even among those cited by xmldb.org, have rather lukewarm things to
say on the topic. Take David Mertz, PhD:
(http://www-106.ibm.com/developerworks/library/x-matters8/index.html)

"XML is an extremely versatile data transport format, but despite high hopes
for it, XML is mediocre to poor as a data storage and access format. ..."
<snip>
" ...XML has no inherent mechanism for enforcing constraints of this sort
(DTDs and schemas are constraints of a different, more limited sort). Without
constraints, you just have data, not a data model (to slightly oversimplify
matters). ..." <snip>
" ... In other words, go ahead and be excited by XML's promise of a universal
data transport mechanism, but keep your backend data on something designed
for it, like DB2 or Oracle (or on Postgres or MySQL for smaller-scale
systems)."

And this guy is cited by XMLDB.org? Perhaps not surprising, as among the 5
goals of XMLDB.org, development of a standard theory of XML databases is not
present.
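
To make Mertz's point about constraints concrete, here is a minimal sketch.
Everything in it is my own illustration rather than anything from his
article: the shop/customer/order names are made up, and SQLite (via Python's
standard library) stands in for a real SQL server. The XML parser checks only
well-formedness, so it accepts an order that points at a customer who doesn't
exist; a DTD or schema could check some of this, which is the "more limited
sort" of constraint he mentions, and is not shown here.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document: the order references customer 2, who does not exist,
# and carries a negative quantity.
doc = """
<shop>
  <customer id="1" name="Ada"/>
  <order id="10" customer="2" qty="-5"/>
</shop>
"""

# A well-formedness parse accepts the document without complaint.
root = ET.fromstring(doc)
order = root.find("order")
xml_accepted = order is not None and order.get("customer") == "2"

# The same facts in a relational schema, with the constraints declared.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        qty INTEGER NOT NULL CHECK (qty > 0)
    )
""")
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")


def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO orders VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


bad_customer = try_insert((10, 2, 5))   # no such customer: rejected
bad_quantity = try_insert((11, 1, -5))  # fails the CHECK: rejected
good_order = try_insert((12, 1, 5))     # satisfies both: accepted
```

The database refuses the bad rows at write time; with the bare XML document,
that checking would have to live in application code.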

> All the major database
> vendors (Oracle, IBM and Microsoft) are investing fairly heavily in
> core-engine XMLDB technology.

So? Oracle, IBM and Microsoft also have SQL databases that do a terrible job
of upholding the SQL standard, and their (at least Oracle's and Microsoft's)
adherence is getting worse with successive versions rather than better. I
wouldn't look to them for guidance.

If they're spending millions on XML databases, it's because the idea, however
wrong-headed, is a fad; fads mean sales, and they don't want to take a
chance on missing out. And these companies have backed plenty of useless
technologies before; remember Microsoft's "Periodicals on CD"?

Not that I'm against XML; as far as I'm concerned, for interchangeable,
searchable, and archival documents, XML is the greatest thing since sliced
Beatles. I love XML-RPC for pushing data through HTTP, and I will happily
be in the cheering squad for anyone who writes a set of OSS tools to extract
data from XML docs stored in a PostgreSQL database, or to automate
some-standard-XML-to-relational-data-and-back conversion. That is a good
application of XML+Database ideas.
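
As a rough sketch of what I mean by XML-to-relational-and-back conversion
(an illustration only: the <people>/<person> layout and the function names
are invented for the example, and SQLite from Python's standard library
stands in for PostgreSQL so that it runs anywhere):

```python
import sqlite3
import xml.etree.ElementTree as ET


def xml_to_rows(xml_text):
    """Flatten a hypothetical <people> document into (id, name, email) tuples."""
    root = ET.fromstring(xml_text)
    return [
        (int(person.get("id")), person.findtext("name"), person.findtext("email"))
        for person in root.findall("person")
    ]


def rows_to_xml(rows):
    """Rebuild the same hypothetical document layout from relational rows."""
    root = ET.Element("people")
    for person_id, name, email in rows:
        person = ET.SubElement(root, "person", id=str(person_id))
        ET.SubElement(person, "name").text = name
        ET.SubElement(person, "email").text = email
    return ET.tostring(root, encoding="unicode")


source = (
    '<people>'
    '<person id="1"><name>Ada</name><email>ada@example.org</email></person>'
    '<person id="2"><name>Bob</name><email>bob@example.org</email></person>'
    '</people>'
)

# Load the XML into a table, where it can be queried and constrained as usual.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT NOT NULL)"
)
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", xml_to_rows(source))

# Read the rows back and serialize them to XML again for transport.
rows = conn.execute("SELECT id, name, email FROM person ORDER BY id").fetchall()
round_trip = rows_to_xml(rows)
```

The data lives in tables; XML shows up only at the edges, as the
interchange format.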

XML databases, on the other hand, are an example of taking a good idea too
far. XML is a great data transmission tool; it's a great document
transformation tool; it's a good way to store documents. It is not,
however, a good database.
------------------------------------------------------
--
Josh Berkus
Aglio Database Solutions
San Francisco
