Re: multiline CSV fields

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: multiline CSV fields
Date: 2004-11-30 19:34:06
Lists: pgsql-hackers, pgsql-patches

Andrew Dunstan wrote:
> Bruce Momjian wrote:
> >I am wondering if one good solution would be to pre-process the input
> >stream in copy.c to convert newline to \n and carriage return to \r and
> >double data backslashes and tell copy.c to interpret those like it does
> >for normal text COPY files.  That way, the changes to copy.c might be
> >minimal; basically, place a filter in front of the CSV file that cleans
> >up the input so it can be more easily processed.
> >  
> >
> This would have to parse the input stream, because you would need to 
> know which CRs and LFs were part of the data stream and so should be 
> escaped, and which really ended data lines and so should be left alone. 
> However, while the idea is basically sound, parsing the stream twice 
> seems crazy. My argument has been that at this stage in the dev cycle we 
> should document the limitation, maybe issue a warning as you want, and 
> make the more invasive code changes to fix it properly in 8.1. If you 

OK, right.
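
To make the filter idea concrete, here is a minimal sketch in Python. It is not the copy.c code: the function name escape_csv_newlines is hypothetical, and it assumes the quote character is escaped by doubling it. It shows why the filter has to parse: it must track whether it is inside a quoted field so that only embedded CR/LF are escaped and real line terminators are left alone.

```python
def escape_csv_newlines(data: str, quote: str = '"') -> str:
    """Escape CR/LF inside quoted CSV fields; leave record terminators alone.

    Assumption: quotes inside a field are escaped by doubling them ("").
    """
    out = []
    in_quotes = False
    for ch in data:
        if ch == quote:
            # A doubled quote toggles the state twice, so it ends up unchanged.
            in_quotes = not in_quotes
            out.append(ch)
        elif ch == "\\":
            # Double data backslashes so a later text-mode parser keeps them.
            out.append("\\\\")
        elif ch == "\n" and in_quotes:
            out.append("\\n")  # embedded newline becomes the two characters \ and n
        elif ch == "\r" and in_quotes:
            out.append("\\r")  # embedded carriage return becomes \ and r
        else:
            out.append(ch)  # includes CR/LF that really end a data line
    return "".join(out)
```

The output still has to be split into lines and fields by the existing code, which is the second parse Andrew objects to.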

> don't want to wait, then following your train of thought a bit, ISTM 
> that the correct solution is a routine for CSV mode that combines the 
> functions of CopyReadAttributeCSV() and CopyReadLine(). Then we'd have a 
> genuine and fast fix for Greg's and Darcy's problem.

We are fine for 8.0, except for the warning, and you think we can fix it
perfectly in 8.1, good.
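
For comparison, the combined approach Andrew describes might look roughly like the sketch below: one pass that finds the end of a record and splits the fields at the same time. This is Python for illustration only, not the CopyReadAttributeCSV()/CopyReadLine() code; read_csv_records is a hypothetical name, and doubled-quote escaping is assumed.

```python
from typing import Iterator, List


def read_csv_records(data: str, delim: str = ",", quote: str = '"') -> Iterator[List[str]]:
    """Yield one list of fields per logical CSV record, in a single pass.

    A record ends only at a CR, LF, or CRLF that is outside quotes.
    Assumption: quotes inside a field are escaped by doubling them ("").
    """
    fields: List[str] = []
    field: List[str] = []
    in_quotes = False
    pending = False  # True while a record has been started but not yet yielded
    i, n = 0, len(data)
    while i < n:
        ch = data[i]
        pending = True
        if in_quotes:
            if ch == quote:
                if i + 1 < n and data[i + 1] == quote:
                    field.append(quote)  # doubled quote is a literal quote
                    i += 1
                else:
                    in_quotes = False  # closing quote
            else:
                field.append(ch)  # embedded CR/LF is kept as data
        elif ch == quote:
            in_quotes = True
        elif ch == delim:
            fields.append("".join(field))
            field = []
        elif ch in "\r\n":
            if ch == "\r" and i + 1 < n and data[i + 1] == "\n":
                i += 1  # treat CRLF as one terminator
            fields.append("".join(field))
            yield fields
            fields, field, pending = [], [], False
        else:
            field.append(ch)
        i += 1
    if pending:
        # Last record had no trailing newline (or ended inside an open quote).
        fields.append("".join(field))
        yield fields
```

Because the quote state is known when a line terminator is seen, no separate pre-processing pass is needed.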

  Bruce Momjian                        |
  pgman(at)candle(dot)pha(dot)pa(dot)us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
