Re: proposal: psql statements \gstore \gstore_binary (instead COPY RAW)

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: psql statements \gstore \gstore_binary (instead COPY RAW)
Date: 2016-12-10 06:11:26
Message-ID: CAFj8pRBzMtiyBLqWW+iiim+5zfc0VvAY0+17G_vPK_kwMztVSQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2016-12-10 2:27 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:

> On 12/9/16 9:39 AM, Pavel Stehule wrote:
>
>>
>> SELECT image FROM accounts WHERE id = xxx
>> \gstore_binary ~/image.png
>>
>> What do you think about this proposal?
>>
>
> Seems reasonable.
>
> I've lost track at this point... is there a way to go the other direction
> with that as well? Namely, stick the contents of a file into a field via an
> INSERT or UPDATE?
>

a target of this feature is storing only. For import there should be
another statements.

I am think so there is a consensus (with Tom) on binary passing mode. Some
like

\set USE_BINARY on

What is not clean (where is not a agreement is a way how to get a some
content) - if we use a variables with content (not references), then we can
or cannot to have special statement

so some ways how to push some binary content to server

A)
\set xxxx `cat file`
\set USE_BINARY on
INSERT INTO tab(id, data) VALUES(1, :xxxx::bytea);

B)
\set xxxx `cat file`
INSERT INTO tab(id, data) VALUES (1, :x'xxxx'); -- use bytea escape

C)
\load_binary xxxx file
INSERT INTO tab(id, data) VALUES(1, :'xxxx');

D)
\load xxxx file
\set USE_BINARY on
INSERT INTO tab(id, data) VALUES(1, :xxxx::bytea);

E)
\set xxxx ``cat file``
INSERT INTO tab(id, data) VALUES (1, :'xxxx');

any from mentioned variants has some advantages - and I don't see a clean
winner. I like a binary mode for passing - the patch is small and clean and
possible errors are well readable (not a MBytes of hexa numbers). Passing
in text mode is safe - although the some errors, logs can be crazy. I would
to use some form of "load" backslash command ("load", "load_binary"): a) we
can implement a file tab complete, b) we can hide some platform specific
("cat" linux, "type" windows).

Now, only text variables are supported - it is enough for passing XML, JSON
- but not for binary data (one important variant is passing XML binary for
automatic XML internal encoding transformation). So we should to encode
content before storing to variable, or we should to introduce binary
variables. It is not hard - introduce new functions, current API will
supports text variables.

The implementation of these variants is short, simple - we can implement
more than exactly one way - @E is general, but little bit magic, and
without a autocomplete possibility, @C is very clear

The discussion can be about importance following features:

1. binary passing (important for XML, doesn't fill a logs, a speed is not
important in this context)
2. tab complete support
3. verbosity, readability

I would to know how these points are important, interesting for other
people? It can helps with choosing variant or variants that we can
implement. I don't expect some significant differences in implementation
complexity of mentioned variants - the code changes will be +/- same.

Regards

Pavel

>
> I've done that in the past via psql -v var=`cat file`, but there's
> obviously some significant drawbacks to that...
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 855-TREBLE2 (855-873-2532)
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Keith Fiske 2016-12-10 06:52:44 Re: [COMMITTERS] pgsql: Implement table partitioning.
Previous Message Karl O. Pinc 2016-12-10 05:36:12 Re: Patch to implement pg_current_logfile() function