
Re: What is best way to stream terabytes of data into postgresql?

From: "Frank Wosczyna" <frankw(at)greenplum(dot)com>
To: "Jeffrey Tenny" <jeffrey(dot)tenny(at)comcast(dot)net>,pgsql-performance(at)postgresql(dot)org
Subject: Re: What is best way to stream terabytes of data into postgresql?
Date: 2005-07-21 18:26:04
Message-ID: BB05A27C22288540A3A3E8F3749B45AB0103578E@MI8NYCMAIL06.Mi8.com
Lists: pgsql-performance
 


> Subject: [PERFORM] What is best way to stream terabytes of 
> data into postgresql?
> 
> Preferably via JDBC, but by C/C++ if necessary.
> 
> Streaming being the operative word.
> 
> Tips appreciated.
> 

Hi,

We contributed our Java Loader to the Bizgres open source project:
http://www.bizgres.org/assets/BZ_userguide.htm#50413574_pgfId-110126

You can load from STDIN instead of a file, as long as you prepend the
stream with the Loader Control file, for example:

for name in customer orders lineitem partsupp supplier part; do
    cat TPCH_load_100gb_${name}.ctl /mnt/<remote-host>/TPCH-Data/${name}.tbl.* | \
        loader.sh -h localhost -p 10001 -d tpch -t -u mpp
done

You can also run the loader from a remote host, with the "-h" <host> option
pointing at the target system that runs the Postgres database.

If you have terabytes of data, you might want to set a batch size (-b switch)
so that the loader commits periodically instead of holding the entire load in
a single transaction.
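
As a rough sketch (assuming -b takes a row count per commit; check the loader
documentation for the exact semantics), the batch size is simply appended to
the loader invocation:

    # commit every 100000 rows (the value here is illustrative)
    cat TPCH_load_100gb_lineitem.ctl /mnt/<remote-host>/TPCH-Data/lineitem.tbl.* | \
        loader.sh -h localhost -p 10001 -d tpch -t -u mpp -b 100000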

Feel free to contact me directly if you have questions.

Thanks,

Frank

Frank Wosczyna
Systems Engineer
Greenplum / Bizgres MPP
www.greenplum.com

