Re: CopyReadLineText optimization revisited

From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CopyReadLineText optimization revisited
Date: 2010-08-27 16:21:15
Message-ID: m2lj7s3u2s.fsf@hi-media.com
Lists: pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> Ok. If we have to, we can keep that, it just requires more
> programming. After searching for a \n, we can peek at the previous byte to
> check if it's a backslash (and if it is, the one before that to see if it's
> a backslash too, and so forth until we find a non-backslash).

That's what pgloader does to allow for non-quoted fields containing an
escaped separator in some contrived input formats (UNLOAD from Informix,
I'm looking at you).
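
To make the backward peek concrete, here is a minimal sketch in Python. It is
not the actual pgloader or COPY code; the helper names newline_is_escaped and
find_line_end are made up for illustration, and it assumes a single-byte
backslash escape character in a buffer that is already in memory.

```python
def newline_is_escaped(buf: bytes, pos: int, escape: int = ord("\\")) -> bool:
    """Return True if the byte at buf[pos] is preceded by an odd run of escapes."""
    count = 0
    i = pos - 1
    # Walk backwards over the run of escape bytes directly before pos.
    while i >= 0 and buf[i] == escape:
        count += 1
        i -= 1
    # An even run means the backslashes escape each other, not the newline.
    return count % 2 == 1


def find_line_end(buf: bytes, start: int = 0) -> int:
    """Return the index of the first unescaped newline at or after start, or -1."""
    pos = buf.find(b"\n", start)
    while pos != -1 and newline_is_escaped(buf, pos):
        pos = buf.find(b"\n", pos + 1)
    return pos
```

The fast search (bytes.find here, something like memchr in C) does the bulk of
the work, and the backward scan only runs once per candidate newline.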

I guess the same kind of trick could be applied to CSV too, but it'd
be necessary to search back to the previous \n and count the QUOTE chars
you find, which does not sound like a huge win, even if you remember the
state at the last quoted \n.
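
A rough sketch of that quote counting in Python follows. The names
newline_ends_record and find_record_end are hypothetical. It assumes the
QUOTE character is also the escape character (an embedded quote is written
doubled), so the parity of quotes seen since a known record start tells
whether a newline is inside a quoted field. With a separate ESCAPE character
this simple parity test would not be enough.

```python
def newline_ends_record(buf: bytes, record_start: int, pos: int,
                        quote: bytes = b'"') -> bool:
    """Return True if the newline at buf[pos] is outside any quoted field."""
    # An even number of quote bytes since the record start means every
    # quoted field opened so far has also been closed.
    return buf.count(quote, record_start, pos) % 2 == 0


def find_record_end(buf: bytes, record_start: int = 0) -> int:
    """Return the index of the newline that ends the record, or -1."""
    pos = buf.find(b"\n", record_start)
    while pos != -1 and not newline_ends_record(buf, record_start, pos):
        pos = buf.find(b"\n", pos + 1)
    return pos
```

Every candidate newline recounts the quotes from the record start, so the
quote bytes still get visited. Remembering the parity at the last newline
examined would avoid the recount, but not the extra pass over the data.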

Fancy format parsing ain't fun.

Regards,
--
dim
