Re: Importing Many XML Records

From: "Ron St-Pierre" <ronstp(at)mail(dot)com>
To: "George Pavlov" <gpavlov(at)mynewplace(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Importing Many XML Records
Date: 2006-01-28 01:54:00
Message-ID: 20060128015400.702391F50B1@ws1-2.us4.outblaze.com
Lists: pgsql-general

Thanks, George. I just returned from the bookstore, where I was looking at an XSLT solution in one of the books. I want to import the data into the DB as regular table data, not as XML. I'll look into Saxon and TagSoup as well as the Perl module you mentioned. As for this being outside the scope of the list, I asked here because I wasn't sure whether there were PostgreSQL modules to deal with this.
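
For the regular-table route, the XML-to-flat-file step George describes could also be done without XSLT. Below is a minimal sketch in Python, not a tested solution for my real data: the element names (records, record, id, name, price), the column list, and the function name xml_to_csv are all made up for illustration, since the actual file layout isn't shown in this thread. It streams the file so that 10,000+ records need not be held in memory at once.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample input; the real file's element names are not given in this thread.
SAMPLE_XML = """<records>
  <record><id>1</id><name>Widget, large</name><price>9.50</price></record>
  <record><id>2</id><name>Gadget</name></record>
</records>"""

# Hypothetical column list; it must match the target table's column order.
COLUMNS = ["id", "name", "price"]


def xml_to_csv(source, out, record_tag="record", columns=COLUMNS):
    """Stream record elements from source and write one CSV row per record.

    Returns the number of rows written.
    """
    writer = csv.writer(out, lineterminator="\n")
    count = 0
    # iterparse reads the document incrementally instead of loading it whole.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != record_tag:
            continue
        # Missing or empty child elements become empty fields.
        writer.writerow([(elem.findtext(col) or "").strip() for col in columns])
        count += 1
        elem.clear()  # drop the children of the record just handled
    return count


if __name__ == "__main__":
    buf = io.StringIO()
    n = xml_to_csv(io.BytesIO(SAMPLE_XML.encode("utf-8")), buf)
    print(n, "rows written")
    print(buf.getvalue(), end="")
```

The resulting file would then be loaded with COPY. One caveat: as far as I know, the CSV option of COPY only arrived in PostgreSQL 8.0, so on 7.4 the output would have to be tab-delimited text (with tabs, newlines and backslashes escaped) rather than quoted CSV.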

Thanks for pointing me to possible solutions.

Ron

> ----- Original Message -----
> From: "George Pavlov" <gpavlov(at)mynewplace(dot)com>
> To: "Ron St-Pierre" <ronstp(at)mail(dot)com>
> Subject: Re: [GENERAL] Importing Many XML Records
> Date: Fri, 27 Jan 2006 16:03:20 -0800
>
>
> > I'm sure that this has been asked before, but I can't find any
> > reference to it on Google, and the search facility on
> > postgresql.org is currently down.
>
> http://groups.google.com/groups?q=group%3Apgsql.*
>
> provides the same with a slight delay but arguably a better user
> interface.
>
> > I have a large number of entries (possibly 10,000+) in an XML
> > file that I need to import into the database (7.4 on Debian) on a
> > daily basis. Does anyone have any recommendations concerning the
> > best way to do this? Is there some tool I should use or should I
> > create the code in Java to parse and import the data?
> >
> > If anyone has done this before, I would appreciate hearing how they did this.
>
> This is generally outside the scope of this list. I am guessing (since I
> don't know much about your data format or goals), but you probably want
> to first transform the XML into a format suitable for import into
> the database using COPY, or (much less desirable) a bunch of INSERT
> statements. In either case you should become familiar with XSLT
> processing and write yourself an XSLT template to do the job.
>
> I deal with a similar task using Saxon and TagSoup (which I highly
> recommend for XML that is not well-formed) and create a CSV file out
> of a multitude of XML files (or a single XML file), which can then be
> COPY-ed into a PG table. Instead of a CSV file one could create a SQL
> script file of INSERT statements. I recommend Jeni Tennison's "Beginning
> XSLT" book as an excellent reference on the subject of XSLT.
>
> Depending on what your XML looks like, you may get away without XSLT
> altogether and just preprocess it with awk, sed, Perl (Template::Extract
> is a useful module) or whatever strikes your fancy.
>
> Another question to answer: do you want the "records" to stay as XML
> in the database, or do you want to import them into a regular table
> format? If the former, you may want to get familiar with the pgxml (aka
> xml2) module so you can query the XML data once it is inside your database.
>
> George