Re: raw output from copy

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Verite <daniel(at)manitou-mail(dot)org>, hlinnaka <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavel Golub <pavel(at)microolap(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>
Subject: Re: raw output from copy
Date: 2016-04-08 18:54:42
Message-ID: 5707FE72.3050907@dunslane.net
Lists: pgsql-hackers

On 04/08/2016 02:13 PM, Robert Haas wrote:
> On Tue, Apr 5, 2016 at 4:45 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> here is cleaned/finished previous implementation of RAW_TEXT/RAW_BINARY
>> formats for COPY statements.
>>
>> RAW with the text format means unescaped data, but with correct encoding -
>> input/output is done with the type's input/output functions. RAW binary means
>> content produced/received by the send/receive functions.
>>
>> Now both directions (input/output) work well.
>>
>> Some examples of expected usage:
>>
>> copy (select xmlelement(name foo, 'hello')) to stdout (format raw_binary,
>> encoding 'latin2');
>>
>> create table avatars(id serial, picture bytea);
>> \copy avatars(picture) from ~/images/foo.jpg (format raw_binary);
>> select lastval();
>>
>> create table doc(id serial, txt text);
>> \copy doc(txt) from ~/files/aaa.txt (format raw_text, encoding 'latin2');
>> select lastval();
> As much as I know you and some other people would like it to be
> otherwise, this patch clearly does not have a sufficient degree of
> consensus to justify committing it to PostgreSQL 9.6. I'm marking it
> Returned with Feedback.
>

I should add that I've been thinking about this some more, and that I
now agree that something should be done to support this at the SQL
level, mainly so that clients can manage very large pieces of data in a
stream-oriented fashion rather than having to marshal the data in
memory to load/unload via INSERT/SELECT. Anything that is client-side
only is likely to have this memory issue.
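
To make the memory point concrete, here is a minimal sketch in Python. It talks to no database and uses no PostgreSQL API; the function names, the in-memory streams, and the 8192-byte chunk size are all assumptions chosen for illustration. It only contrasts holding a whole value in one buffer with forwarding it piece by piece.

```python
import io

CHUNK_SIZE = 8192  # assumed chunk size, chosen only for illustration


def send_marshalled(source, sink):
    """Read the whole value into memory, then hand it over in one piece."""
    data = source.read()  # the entire payload is resident at once
    sink.write(data)
    return len(data)  # size of the largest buffer held, in bytes


def send_streamed(source, sink, chunk_size=CHUNK_SIZE):
    """Forward the value chunk by chunk; only one chunk is held at a time."""
    peak = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty read means end of input
            break
        sink.write(chunk)
        peak = max(peak, len(chunk))
    return peak  # size of the largest buffer held, in bytes


# Stand-ins for a large file and a connection (hypothetical, in-memory only).
payload = b"x" * 1_000_000
out_a, out_b = io.BytesIO(), io.BytesIO()
peak_a = send_marshalled(io.BytesIO(payload), out_a)
peak_b = send_streamed(io.BytesIO(payload), out_b)
```

Both functions deliver the same bytes, but the first needs a buffer as large as the value itself, while the second never holds more than one chunk. That bounded buffer is what a stream-oriented path would give a client.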

At the same time I'm still not entirely convinced that COPY is a good
vehicle for this. It's designed for bulk records, and already quite
complex. Maybe we need something new that uses the COPY protocol but is
more specifically tailored for loading or sending large singleton pieces
of data.

cheers

andrew
