Re: returning PGresult as xml

From: Scott Lamb <slamb(at)slamb(dot)org>
To: Brian Moore <brianmooreca(at)yahoo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: returning PGresult as xml
Date: 2004-01-27 12:58:23
Message-ID: 7D909592-50C8-11D8-B400-000A95891440@slamb.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Jan 25, 2004, at 3:07 AM, Brian Moore wrote:
> it's been said that converting a PGresult into xml is "trivial" and
> that's why it hasn't been done in the codebase as of yet. i have seen
> much code that writes xml, and many mistakes are made. most often
> improper escaping, or writing to a schema/DTD that has not been
> well-thought out. the transformation into xml is not difficult, but it
> does require attention to detail.

The escaping, at any rate, is trivial if you use a proper API. It
sounds like your code is not using any XML API, given that you have not
mentioned adding dependencies to libpq and that you've mentioned your
own hashtable algorithm. It would be much easier if you did so, though
I imagine the additional dependency would mean it would not be accepted
into libpq.

> <PGresult num_rows='1' num_cols='2'>
> <col_desc num='0' type='int4' format='text' name='foo' />
> <col_desc num='1' type='int4' format='text' name='bar' />
> <row num='0'>
> <col num='0'>1</col>
> <col num='1'>2</col>
> </row>
> </PGresult>

How would you filter for a column in XSLT based on column name with
this schema? It's certainly not trivial. I have similar code, and I
included the column name as an attribute in each column element for
this reason.

I also used the java.sql type names rather than PostgreSQL ones, as my
code is not specific to PostgreSQL.

> i would expect that integration would look something like exposing
> from libpq a function that looks something like:
> const char *PGresult_as_xml(PGresult *result, int include_dtd);

Ugh. So it returns the whole thing as one big string? That won't hold
up well if your resultset is large.

A better way would be to pump out SAX events. This is what I did for
three reasons:

1) The escaping becomes trivial, as mentioned above. In fact, not only
does SAX escape things correctly, but it makes you explicitly specify
that the string you're giving it is character data, an element name, an
attribute name, an attribute value, etc, and handles everything
properly based on that. So you'd really have to work to screw it up,
unlike code that just does like

printf("<elem foo='%s' bar='%s'>%s</elem>",
xml_attr_escape(foo_val), xml_attr_escape(bar_val),
xml_char_escape(elem_val));

where it would be quite easy to lose track of what needs to be escaped
how, what variables are already escaped, etc.

2) It can stream large result sets, provided that the next stage
supports doing so. Certainly a raw SAX serializer would, also some XSLT
stylesheets with Xalan, and STX/Joost is designed for streaming
transformations.

3) If the next stage is a transformation, this makes it unnecessary to
serialize and parse the data between. So the SAX way is faster.

You're welcome to take a look at my code. I imagine it will not be
directly useful to you, as it is written in Java, but I have a live
example which puts this stuff to use. Designing an acceptable API and
schema is always much easier when you see how it is put to use.

<http://www.slamb.org/projects/xmldb/> - my (so far poorly-named) xmldb
project, which includes the org.slamb.xmldb.ResultSetProducer class to
transform a java.sql.ResultSet to SAX events in my resultset schema.

<http://www.slamb.org/svn/repos/projects/xmldb/src/java/org/slamb/
xmldb/ResultSetProducer.java> - source code for said class

<http://www.slamb.org/projects/mb/> - a message board which uses this
code and some XSLT

<https://www.slamb.org/mb/> - a live example of said message board

<http://www.slamb.org/svn/repos/projects/mb/src/WEB-INF/xsl/
resultset.xsl> - simple XSLT to take an arbitrary resultset and convert
it to an HTML table

<http://www.slamb.org/svn/repos/projects/mb/src/WEB-INF/xsl/main.xsl> -
an example XSLT file that inherits this and then provides exceptions
for a couple columns (not displaying the id column, instead including
it as a hyperlink in the name column).

Good luck.

Scott Lamb

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Treat 2004-01-27 13:10:32 Re: LWLock/ShmemIndex startup question
Previous Message ohp 2004-01-27 12:33:44 Most urgent