Skip site navigation (1) Skip section navigation (2)

Re: NOLOGGING option, or ?

From: "Alon Goldshuv" <agoldshuv(at)greenplum(dot)com>
To: "Steve Atkins" <steve(at)blighty(dot)com>, pgsql-hackers(at)postgresql(dot)org,"Alon Goldshuv" <agoldshuv(at)greenplum(dot)com>
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-02 04:30:01
Message-ID: BB05A27C22288540A3A3E8F3749B45AB15E162@MI8NYCMAIL06.Mi8.com (view raw or flat)
Thread:
Lists: pgsql-hackers
>I've been following this thread, and I'm a little confused. Could you
possibly clarify what you mean, by providing a couple of lines of
input as it would be formatted with escape processing turned off - 
containing a text field with an embedded newline and tab and a null field.


yeah, this is all a bit confusing, but I *hope* I can clarify things here as I think I got a better understanding now.

before that, let me just clarify that the performance improvements in the patch have nothing to do with the escaping mechanizm. Escapes could change. The performance gain in due to a buffered processing with minimal line/attribute buffer loads.

I think that the basic issue is that there are some database users that would like to take their data and put it into the database without pre-processing it - regardless if there are any backslashes in it or 0x0D (CR's) etc... these are the users I am targeting in my patch as these are the users I ran into in the field. The only responsibility of these users is to explicitly escape any delimiter or 0x0A (LF) characters that they intend to have as DATA. that's all. 

On the other hand there are users that would like to pre-process their data with C-escape sequences (or alternatevly, users that already have their data escaped) - this is what the postgres COPY targets these days.

2 different ways to do it... none of them is right or wrong.

Examples:

users that my patch targets may have a data row as such (delim = '|', EOL = [LF]):

    c:\one\two|d:\ten\nine[LF]

using the way i do escaping in my patch those 2 fields of data will end up in the DB as
Field 1: c:\one\two
Field 2: d:\ten\nine

which is what the user would want. If they wanted to have a pipe char in the second field they could escape it as such: d:\ten\nine here is a pipe \| [LF] and no error will occur, and result will be:

Field 2: d:\ten\nine here is a pipe |

If you try to insert that first data line above using the existing COPY command you will get an undesired result:

Field 1: c:one   wo
Field 2: d:    
         ine


Now, the other way around, users that do intend for their data to have escape sequences in it may have a line like this:

    that's a \t tab| and this is a \nline feed [LF]

and will get the desired result of:

Field 1: that's a       tab
Field 2: and this is a 
         line feed

while using my code they will get undesired results:
Field 1: that's a \t tab
Field 2: and this is a \nline feed


so, basically it really depends on the target audience...

Bruce, does that sounds right to you?
Alon.





-----Original Message-----
From: pgsql-hackers-owner(at)postgresql(dot)org on behalf of Steve Atkins
Sent: Wed 6/1/2005 10:47 PM
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] NOLOGGING option, or ?
 
On Wed, Jun 01, 2005 at 07:35:33PM -0700, Luke Lonergan wrote:
> >> I propose an extended syntax to COPY with a change in semantics to remove
> >> the default of "WITH ESCAPE '\'".
> > 
> > Er, doesn't this break existing database dumps?
> 
> Yes, one of the previously stated reasons to create another command for
> loading data.
> 
> Another possible approach is to keep the default, but allow the escape
> processing to be turned off.

I've been following this thread, and I'm a little confused. Could you
possibly clarify what you mean, by providing a couple of lines of
input as it would be formatted with escape processing turned off - 
containing a text field with an embedded newline and tab and a null field.

Cheers,
  Steve

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster






--------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster




Responses

pgsql-hackers by date

Next:From: Luke LonerganDate: 2005-06-02 05:16:34
Subject: Re: NOLOGGING option, or ?
Previous:From: Bruce MomjianDate: 2005-06-02 03:54:08
Subject: Re: Backslash handling in strings

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group