Re: large xml database

From: Mike Christensen <mike(at)kitchenpc(dot)com>
To: Viktor Bojović <viktor(dot)bojovic(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: large xml database
Date: 2010-10-30 22:04:53
Message-ID: AANLkTikXgZacvH=-AFx3NMgNQwxU-R2GNjq=U__v_A1A@mail.gmail.com
Lists: pgsql-general

Geeeeeeeez.

Maybe you can lease a bunch of Amazon EC2 high computing slices and
parallelize it? I think throwing ridiculous amounts of hardware at
things is always the best approach.

On Sat, Oct 30, 2010 at 2:48 PM, Viktor Bojović
<viktor(dot)bojovic(at)gmail(dot)com> wrote:
> Hi,
> I have a very big XML document, larger than 50 GB, which I want to import
> into a database and transform into a relational schema.
> When I split this document into smaller independent XML documents I get
> ~11.1 million XML documents.
> I have spent a lot of time looking for the fastest way to transform all this
> data, but every time I give up because it takes too long. Some attempts
> would have taken more than a month if I had not stopped them.
> I have tried inserting each line as varchar into the database and parsing it
> with plperl regexes.
> I have also tried storing every document as XML and parsing it, but that is
> also too slow.
> I have tried storing every document as varchar, but that is also slow when
> using regexes to extract the data.
> Many attempts failed because 8 GB of RAM and 10 GB of swap were not enough.
> Sometimes I also get a message that more than 2^32 operations were performed,
> and the functions stop working.
> I just wanted to ask if someone knows how to speed this up.
>
> Thanks in advance
> --
> ---------------------------------------
> Viktor Bojović
> ---------------------------------------
> Wherever I go, Murphy goes with me
>
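
Before renting hardware, it may be worth checking whether the parsing
can happen outside the database in a single streaming pass. The sketch
below is only an illustration in Python, not something tested on the
50 GB file. It assumes each record is a direct child of the root
element and that its fields are simple child elements. The tag and
column names (entry, id, name) are made up for the example. It reads
one record at a time, so memory stays flat, and formats each record as
a tab-separated line in the text format that COPY ... FROM STDIN
accepts.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, record_tag):
    """Yield one dict per <record_tag> element without loading the whole file.

    Assumes records are direct children of the root element (hypothetical layout).
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            yield {child.tag: (child.text or "") for child in elem}
            root.clear()  # drop finished records so memory use stays flat


def to_copy_line(record, columns):
    """Format one record as a line in PostgreSQL's COPY text format."""
    def esc(value):
        # Backslash must be escaped first, then the delimiter and newlines.
        return (value.replace("\\", "\\\\").replace("\t", "\\t")
                     .replace("\n", "\\n").replace("\r", "\\r"))
    return "\t".join(esc(record.get(col, "")) for col in columns) + "\n"


# Tiny in-memory stand-in for the real file; element names are hypothetical.
sample = (b"<root><entry><id>1</id><name>a\tb</name></entry>"
          b"<entry><id>2</id><name>c</name></entry></root>")
rows = list(iter_records(io.BytesIO(sample), "entry"))
lines = [to_copy_line(row, ["id", "name"]) for row in rows]
```

With the real data, the lines would be written to a file or piped to
psql and loaded with COPY into a staging table, so the database sees
plain rows and never has to parse XML or run a regex per document.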
