From: | Ron Johnson <ron(dot)l(dot)johnson(at)cox(dot)net> |
---|---|
To: | PgSQL Novice ML <pgsql-novice(at)postgresql(dot)org> |
Subject: | Re: Better way to bulk-load millions of CSV records into |
Date: | 2002-05-22 18:51:45 |
Message-ID: | 1022093505.19121.48.camel@rebel |
Lists: | pgsql-novice |
On Wed, 2002-05-22 at 13:11, Marc Spitzer wrote:
> On Wed, May 22, 2002 at 12:48:58PM -0500, Ron Johnson wrote:
> > On Wed, 2002-05-22 at 11:18, Marc Spitzer wrote:
> > > On Wed, May 22, 2002 at 09:19:31AM -0500, Tom Sheehan wrote:
[snip]
> for i in load_data/* ;do
> echo "datafile $i"
> awk -F, 'BEGIN{OFS=","}{if ($15~/[.]/){$15="-1"; $0=$0} print $0}' $i >$i.tmp
> mv $i.tmp $i
> grep -E "[0-9]+([.][0-9]+)+" $i
> grep -vE "[0-9]+([.][0-9]+)+" $i >$i.tmp
> mv $i.tmp $i
> echo "copy call_me_bob from '/home/marc/projects/bobs_house/$i' using Delimiters ',' with null $
> done
[snip]
I'm not an awk programmer. What does that command do?
Also, all my fields have double-quotes around them. Is there
a tool (or really clever use of sed) that will strip them
away from the fields that don't need them? I actually have
_comma_ delimited files, and any fields with commas in them
need the double quotes...
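
One way to approach the quote-stripping question is to use a real CSV parser rather than sed, since sed cannot easily tell a comma inside a quoted field from a delimiter. The following is a minimal sketch in Python, assuming the files are ordinary comma-delimited CSV with double-quote quoting; the function name `strip_unneeded_quotes` and the sample row are hypothetical and not taken from the thread. It re-writes each row with minimal quoting, so only fields that contain a comma, a double quote, or a line break keep their quotes.

```python
import csv
import io


def strip_unneeded_quotes(src, dst):
    """Copy CSV rows from src to dst, quoting only fields that need it.

    src and dst are text-mode file-like objects. Real files should be
    opened with newline="" as the csv module documentation recommends.
    """
    reader = csv.reader(src)  # parses "a","b, c" into ['a', 'b, c']
    writer = csv.writer(dst, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for row in reader:
        writer.writerow(row)


if __name__ == "__main__":
    # Hypothetical sample row: every field quoted, one contains a comma.
    sample = io.StringIO('"1","Smith, John","New Orleans"\n')
    out = io.StringIO()
    strip_unneeded_quotes(sample, out)
    print(out.getvalue(), end="")  # prints: 1,"Smith, John",New Orleans
```

Fields that contain commas still come out quoted, so whatever loads the result has to understand CSV quoting; if it does not, switching to a delimiter that never occurs in the data (for example by passing `delimiter="\t"` to `csv.writer`) is one alternative.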
--
+---------------------------------------------------------+
| Ron Johnson, Jr. Home: ron(dot)l(dot)johnson(at)cox(dot)net |
| Jefferson, LA USA http://ronandheather.dhs.org:81 |
| |
| "I have created a government of whirled peas..." |
| Maharishi Mahesh Yogi, 12-May-2002, |
| CNN, Larry King Live |
+---------------------------------------------------------+