Re: Better way to bulk-load millions of CSV records into

From: Ron Johnson <ron(dot)l(dot)johnson(at)cox(dot)net>
To: PgSQL Novice ML <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Better way to bulk-load millions of CSV records into
Date: 2002-05-22 18:51:45
Message-ID: 1022093505.19121.48.camel@rebel
Lists: pgsql-novice

On Wed, 2002-05-22 at 13:11, Marc Spitzer wrote:
> On Wed, May 22, 2002 at 12:48:58PM -0500, Ron Johnson wrote:
> > On Wed, 2002-05-22 at 11:18, Marc Spitzer wrote:
> > > On Wed, May 22, 2002 at 09:19:31AM -0500, Tom Sheehan wrote:
[snip]
> for i in load_data/* ;do
> echo "datafile $i"
> awk -F, 'BEGIN{OFS=","}{if ($15~/[.]/){$15="-1"; $0=$0} print $0}' $i >$i.tmp
> mv $i.tmp $i
> grep -E "[0-9]+([.][0-9]+)+" $i
> grep -vE "[0-9]+([.][0-9]+)+" $i >$i.tmp
> mv $i.tmp $i
> echo "copy call_me_bob from '/home/marc/projects/bobs_house/$i' using Delimiters ',' with null $
> done
[snip]

I'm not an awk programmer. What does that command do?

Also, all my fields have double quotes around them. Is there
a tool (or a really clever use of sed) that will strip them
from the fields that don't need them? My files really are
_comma_-delimited, so any field that contains a comma still
needs its double quotes...
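
To make the quote-stripping question concrete, here is a rough sketch of the kind of filter being asked about, written as a shell function around awk to match the script quoted above. The name strip_quotes is made up for illustration. It assumes one record per line (no newlines inside quoted fields) and that a literal quote inside a field is written as two quotes (""). Treat it as a starting point under those assumptions, not a finished tool.

```shell
# Hypothetical helper: strip_quotes is an illustrative name, not an existing tool.
# Assumes one record per line and embedded quotes written as "".
strip_quotes() {
    awk '
    {
        out = ""
        rest = $0
        while (rest != "") {
            if (match(rest, /^"([^"]|"")*"/)) {
                # Quoted field: take it whole, then look at what is inside.
                field = substr(rest, 1, RLENGTH)
                rest  = substr(rest, RLENGTH + 1)
                inner = substr(field, 2, length(field) - 2)
                # Keep the quotes only if the contents need them.
                if (inner !~ /[,"]/)
                    field = inner
            } else {
                # Unquoted field: everything up to the next comma.
                match(rest, /^[^,]*/)
                field = substr(rest, 1, RLENGTH)
                rest  = substr(rest, RLENGTH + 1)
            }
            out = out field
            # Consume the delimiter that follows the field, if any.
            if (substr(rest, 1, 1) == ",") {
                out  = out ","
                rest = substr(rest, 2)
            }
        }
        print out
    }' "$@"
}
```

Used as a filter (strip_quotes < in.csv > out.csv, file names hypothetical), a line such as "abc","x,y","12" would come out as abc,"x,y",12. Note that an empty quoted field ("") becomes a plain empty field, which may or may not be what the loader should see.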

--
+---------------------------------------------------------+
| Ron Johnson, Jr. Home: ron(dot)l(dot)johnson(at)cox(dot)net |
| Jefferson, LA USA http://ronandheather.dhs.org:81 |
| |
| "I have created a government of whirled peas..." |
| Maharishi Mahesh Yogi, 12-May-2002, |
! CNN, Larry King Live |
+---------------------------------------------------------+
