Re: raw output from copy

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Daniel Verite <daniel(at)manitou-mail(dot)org>, hlinnaka <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavel Golub <pavel(at)microolap(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: raw output from copy
Date: 2016-03-31 06:12:24
Message-ID: CAFj8pRCUa8QMKmqbfVmsE5zDyH9rdSVL0i=Hku9+nRUVqsYK8w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

2016-03-29 20:59 GMT+02:00 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> > I am writing few lines as summary:
>
> > 1. invention RAW_TEXT and RAW_BINARY
> > 2. for RAW_BINARY: PQbinaryTuples() returns 1 and PQfformat() returns 1
> > 3.a for RAW_TEXT: PQbinaryTuples() returns 0 and PQfformat() returns 0,
> but
> > the client should to check PQcopyFormat() to not print "\n" on the end
> > 3.b for RAW_TEXT: PQbinaryTuples() returns 1 and PQfformat() returns 1,
> but
> > used output function, not necessary client modification
> > 4. PQcopyFormat() returns 0 for text, 1 for binary, 2 for RAW_TEXT, 3 for
> > RAW_BINARY
> > 5. create tests for ecpg
>
> 3.b certainly seems completely wrong. PQfformat==1 would imply binary
> data.
>
> I suggest that PQcopyFormat should be understood as defining the format
> of the copy data encapsulation, not the individual fields. So it would go
> like 0 = traditional text format, 1 = traditional binary format, 2 = raw
> (no encapsulation). You'd need to also look at PQfformat to distinguish
> raw text from raw binary. But if we do it as you suggest above, we've
> locked ourselves into only ever having two field format codes, which
> is something the existing design is specifically intended to allow
> expansion in.
>
>
I wrote concept of raw_text, raw_binary modes.

I am trying to implement text data passing like text format - but for
RAW_TEXT it is not practical. Text passing is designed for one line data,
for multiline data enforces escaping, what we don't would for RAW mode. I
have to skip escaping, and the code is not nice.

So I propose different schema - RAW_TEXT uses text values (uses
input/output functions), enforce encoding from/to client codes and for
passing to client mode is used binary mode - then I don't need to read the
content with line by line. PQbinaryTuples() returns 1 for RAW_TEXT and
RAW_BINARY - in these cases data are passed as one binary value. PQfformat
returns 2 for RAW_TEXT and 3 for RAW_BINARY.

Any objections to this design?

Regards

Pavel

> regards, tom lane
>

Attachment Content-Type Size
copy-raw-format-20160331-04.patch text/x-patch 41.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2016-03-31 06:34:51 Re: raw output from copy
Previous Message Rajkumar Raghuwanshi 2016-03-31 05:35:37 Re: Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result