Re: Encoding problems in PostgreSQL with XML data

From: "Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Encoding problems in PostgreSQL with XML data
Date: 2004-01-09 21:35:08
Message-ID: 303E00EBDD07B943924382E153890E5434AA4A@cuthbert.rcsinc.local
Lists: pgsql-hackers

Andrew Dunstan wrote:
> I think I agree with Rod's opinion elsewhere in this thread. I guess the
> "philosophical" question is this: If 2 XML documents with different
> encodings have the same canonical form, or perhaps produce the same DOM,
> are they equivalent? Merlin appears to want to say "no", and I think I
> want to say "yes".

Er, yes, except for canonical XML. Canonical XML neatly bypasses all
the encoding issues that I can see.
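
A minimal sketch of that point, assuming Python 3.8+ and its standard
library canonicalize() function (which implements C14N 2.0). The sample
document and all variable names are hypothetical; nothing here is server
code. It stores the same text in two encodings and compares the raw bytes
against the canonical forms:

```python
from xml.etree.ElementTree import canonicalize  # available in Python 3.8+

# Hypothetical sample document: the same content stored in two encodings.
body = '<note lang="fr" id="1">caf\u00e9</note>'
latin1_doc = ('<?xml version="1.0" encoding="ISO-8859-1"?>' + body).encode("iso-8859-1")
utf8_doc = ('<?xml version="1.0" encoding="UTF-8"?>' + body).encode("utf-8")

# Byte-wise, the two stored documents differ (declaration and the accented char).
bytes_equal = latin1_doc == utf8_doc

# Canonicalization decodes each document using its declared encoding and
# returns a Unicode string without the XML declaration.
canon_latin1 = canonicalize(latin1_doc)
canon_utf8 = canonicalize(utf8_doc)
canon_equal = canon_latin1 == canon_utf8

print("bytes equal:", bytes_equal)
print("canonical forms equal:", canon_equal)
```

The byte comparison reflects the "no" answer and the canonical comparison
the "yes" answer; which one a server-side XML type should use is exactly
the open question.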

Maybe I am still not getting the basic point, but the part I was not
quite clear on is why the server would need to parse the document at
all, much less change the encoding. Sure, it doesn't necessarily hurt
to do it, but why bother? An external parser could handle both the
parsing and the validation. Reading Peter's post, he seems to be
primarily concerned with an automatic XML validation trigger that comes
built in with the XML 'type'.

*unless*

1. The server needs to parse the document and extract values from it for
indexing/key generation purposes. Then the encoding becomes very
important (especially considering joins between XML and non-XML data
types).
2. There are plans to integrate XPath expressions into queries.
3. The server wants to compose generated XML documents from stored
XML/non-XML sources, with (substantial) additions to the query language
to facilitate this, i.e. a nested data extraction replacement for psql.
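
To illustrate cases 1 and 2, here is a rough sketch in Python using the
standard xml.etree.ElementTree module. The extract_key() helper, the
sample order document, and the customers dict (standing in for a non-XML
table) are all hypothetical. The idea is that a key pulled out of a
document has to be decoded to one common representation before it can be
compared or joined against non-XML values:

```python
import xml.etree.ElementTree as ET

def extract_key(doc_bytes, path):
    """Parse an XML document given as bytes (any declared encoding) and
    return the text at `path` as a Unicode string, or None if nothing matches."""
    node = ET.fromstring(doc_bytes).find(path)
    return None if node is None else node.text

# Hypothetical document stored twice, once per encoding.
body = "<order><customer>Zo\u00eb</customer><total>12.50</total></order>"
latin1_doc = ('<?xml version="1.0" encoding="ISO-8859-1"?>' + body).encode("iso-8859-1")
utf8_doc = ('<?xml version="1.0" encoding="UTF-8"?>' + body).encode("utf-8")

# Stand-in for a non-XML table keyed by customer name.
customers = {"Zo\u00eb": 42}

# ElementTree's limited XPath subset is enough for a simple child lookup.
key_latin1 = extract_key(latin1_doc, "customer")
key_utf8 = extract_key(utf8_doc, "customer")

# Both keys join to the same row once they are decoded to Unicode.
joined = [customers.get(k) for k in (key_latin1, key_utf8)]
print(key_latin1 == key_utf8, joined)
```

If the server indexed the raw bytes instead, the two documents would
produce different keys for the same customer, which is why parsing (and
hence encoding awareness) stops being optional in these cases.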

But, since I'm wishing for things, I may as well ask for a hockey rink
in my living room :)

Merlin
