Re: Some CSV file-import questions

From: Andrew McMillan <andrew(at)catalyst(dot)net(dot)nz>
To: Ron Johnson <ron(dot)l(dot)johnson(at)cox(dot)net>
Cc: PgSQL Novice ML <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Some CSV file-import questions
Date: 2002-05-19 11:13:39
Message-ID: 1021806819.25859.20.camel@kant.mcmillan.net.nz
Lists: pgsql-novice

On Sun, 2002-05-19 at 23:01, Ron Johnson wrote:
>
> If the csv file generated by the application that I am
> importing from contains quotes around each field, must I
> write a program to strip these "field-level" quotes before
> sending the file to COPY?
>
> As we know, COPY is a single transaction. Therefore, it
> would be "unpleasant" if, say, the process that is doing the
> importing dies 90% of the way through a 10,000,000 row table.
> Is there a checkpoint mechanism that would do a COMMIT, for
> example, every 10,000 rows? Then, if the process that is doing
> the importing dies 90% of the way through that 10,000,000 row
> table, when you restart the COPY, it skips over the inserted
> rows.
> Here is an example from the RDBMS that I currently use:
> $ bulkload -load -log -commit=10000 -tran=exclusive -db=test \
> -table=foo ~/foo.csv
> Then, if something happens after inserting 9,000,000 rows,
> it can be restarted by:
> $ bulkload -load -log -commit=10000 -skip=9000000 -db=test \
> -tran=exclusive -table=foo ~/foo.csv
>
> From what I've seen in the documentation and the mailing
> list archives, the solution to both of these questions is
> to roll my own bulk loader.

Yes, or to borrow one someone else has already done.

I have a perl script I use for this sort of thing, and although it
handles the full possibilities of quoting fields, it only loads the
whole file as a single transaction, or as one transaction per line.
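
To illustrate the quoting side of this, here is a minimal Python sketch (not the perl script mentioned above) that parses quoted CSV and rewrites each row in the tab-delimited text format that COPY reads. The function names csv_to_copy_text and copy_escape are made up for this example, and it assumes empty fields should stay empty strings rather than become NULL (\N).

```python
import csv


def copy_escape(field):
    """Backslash-escape the characters that are special in COPY text format."""
    return (
        field.replace("\\", "\\\\")  # escape backslashes first
        .replace("\t", "\\t")        # tab is the default column delimiter
        .replace("\n", "\\n")        # newline ends a row
        .replace("\r", "\\r")
    )


def csv_to_copy_text(lines):
    """Yield one tab-delimited COPY line per CSV row.

    `lines` is any iterable of CSV text lines (an open file works).
    The csv module removes the field-level quotes and un-doubles
    embedded quotes. Assumption: empty fields are passed through as
    empty strings, not converted to \\N.
    """
    for row in csv.reader(lines):
        yield "\t".join(copy_escape(f) for f in row) + "\n"
```

The output of csv_to_copy_text can be written to a file and handed to COPY, so the quotes never reach the server.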

You are welcome to it if you wish. It shouldn't be hard to extend it to
commit in groups of rows, so that a load can be checkpointed - I will
probably even do it myself before the end of the year.
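
As a rough idea of what such an extension could look like, here is a Python sketch (again, not the perl script) that inserts rows through a generic DB-API connection, commits every N rows, and can skip rows already loaded by an earlier run. The function name load_csv_in_batches and its parameters are hypothetical, chosen to mirror the -commit and -skip options in the question. The placeholder style depends on the driver ("?" for sqlite3, "%s" for most PostgreSQL drivers), and the table name is assumed to come from a trusted source.

```python
import csv


def load_csv_in_batches(conn, lines, table, ncols,
                        commit_every=10000, skip=0, placeholder="?"):
    """Insert CSV rows, committing after every `commit_every` rows.

    conn        -- a DB-API 2.0 connection (assumption: autocommit is off)
    lines       -- iterable of CSV text lines, e.g. an open file
    table       -- target table name (interpolated as-is, so must be trusted)
    ncols       -- number of columns in each row
    skip        -- number of leading rows to skip (already committed earlier)
    placeholder -- parameter marker used by the driver

    Returns the total number of rows known to be committed, counting
    the skipped ones, which is the value to pass as `skip` on restart.
    """
    sql = "INSERT INTO {} VALUES ({})".format(
        table, ", ".join([placeholder] * ncols))
    cur = conn.cursor()
    committed = skip
    pending = 0
    for rownum, row in enumerate(csv.reader(lines), start=1):
        if rownum <= skip:
            continue  # loaded by a previous run
        cur.execute(sql, row)
        pending += 1
        if pending == commit_every:
            conn.commit()  # checkpoint: these rows survive a later crash
            committed += pending
            pending = 0
    conn.commit()  # commit the final partial batch
    committed += pending
    return committed
```

If the load dies part way through, the rows up to the last completed batch stay in the table, and rerunning with skip set to that count resumes from there. Finding that count after a crash (from a log, or by counting rows in the table) is left to the caller.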

Regards,
Andrew.
--
--------------------------------------------------------------------
Andrew @ Catalyst .Net.NZ Ltd, PO Box 11-053, Manners St, Wellington
WEB: http://catalyst.net.nz/ PHYS: Level 2, 150-154 Willis St
DDI: +64(4)916-7201 MOB: +64(21)635-694 OFFICE: +64(4)499-2267
Are you enrolled at http://schoolreunions.co.nz/ yet?
