Re: raw output from copy

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Pavel Golub <pavel(at)microolap(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, hlinnaka <hlinnaka(at)iki(dot)fi>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: raw output from copy
Date: 2016-04-12 20:19:51
Message-ID: CAFj8pRCwSi4tQiMYKkHPEMwFF6ZTOcy4gayh+4BQZPDZVxkbOg@mail.gmail.com
Lists: pgsql-hackers

2016-04-12 12:22 GMT+02:00 Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>:

> On 8 Apr 2016 9:14 pm, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com> wrote:
> > 2016-04-08 20:54 GMT+02:00 Andrew Dunstan <andrew(at)dunslane(dot)net>:
> >> I should add that I've been thinking about this some more, and that I
> now agree that something should be done to support this at the SQL level,
> mainly so that clients can manage very large pieces of data in a
> stream-oriented fashion rather than having to marshall the data in memory
> to load/unload via INSERT/SELECT. Anything that is client-side only is
> likely to have this memory issue.
> >>
> >> At the same time I'm still not entirely convinced that COPY is a good
> vehicle for this. It's designed for bulk records, and already quite
> complex. Maybe we need something new that uses the COPY protocol but is
> more specifically tailored for loading or sending large singleton pieces of
> data.
> >
> >
> > Now there is a little more time to think about it. But it is hard to
> design something simpler than the COPY syntax that still supports both
> directions.
>
> Sorry for arriving late and adding to the bikeshedding. Maybe the
> answer is to make COPY pluggable. It seems to me that it would be
> relatively straightforward to add an extension mechanism for copy
> output and input plugins that could support any format expressible as
> a binary stream. Raw output would then be an almost trivial plugin.
> Others could implement JSON, protocol buffers, Redis bulk load, BSON,
> ASN.1 or whatever else serialisation format du jour. It will still
> have the same backwards compatibility issues as adding the raw output,
> but the payoff is greater.
>

I had an idea about additional options for the COPY RAW statement. One could
be a CAST function. Such CAST functions could be used for any format.

COPY has two parts: the client side and the server side. Currently we cannot
extend libpq, and we cannot extend psql. So we have to send data to the client
in the target format, and all transformations should be done on the server
side. Personally, I strongly prefer writing Linux server-side extensions over
MSWin client-side extensions. The client (psql) is able to use a pipe, so
any client-side transformation can be done outside psql.
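
As a rough illustration of the pipe approach, here is a minimal sketch of a
filter that could sit behind psql. It assumes the server sends a bytea value
as hex text, for example via encode(data, 'hex') inside COPY ... TO STDOUT,
and it decodes that text back into raw bytes on the client. The function
name, the chunk size, and the table and column names in the usage note are
hypothetical, not part of any patch.

```python
import sys
from typing import BinaryIO, TextIO


def hex_stream_to_raw(src: TextIO, dst: BinaryIO, chunk_chars: int = 65536) -> int:
    """Decode hex text read from src into raw bytes written to dst.

    Works chunk by chunk, so a large value is never held in memory at once.
    Returns the number of raw bytes written.
    """
    written = 0
    carry = ""  # odd trailing hex digit left over from the previous chunk
    while True:
        chunk = src.read(chunk_chars)
        if not chunk:
            break
        # Drop the line terminator that COPY appends after each row.
        data = carry + chunk.replace("\n", "").replace("\r", "")
        # A byte needs two hex digits; keep a dangling digit for the next round.
        if len(data) % 2:
            data, carry = data[:-1], data[-1]
        else:
            carry = ""
        raw = bytes.fromhex(data)
        dst.write(raw)
        written += len(raw)
    if carry:
        raise ValueError("odd number of hex digits in input")
    return written


def main() -> None:
    # Entry point when the file is used as a filter behind psql;
    # call main() at the bottom of the script to enable it.
    hex_stream_to_raw(sys.stdin, sys.stdout.buffer)
```

With main() called at the end of a script saved as hex2raw.py (a hypothetical
name), the client side could look like this, where images, data and id are
made-up names:

    psql -c "COPY (SELECT encode(data, 'hex') FROM images WHERE id = 1) TO STDOUT" mydb | python3 hex2raw.py > image.png

The hex encoding roughly doubles the transferred size, which is one reason a
raw format on the server side is still attractive.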

Regards

Pavel

>
> Regards,
> Ants Aasma
>
