Re: raw output from copy

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Daniel Verite <daniel(at)manitou-mail(dot)org>, hlinnaka <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavel Golub <pavel(at)microolap(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: raw output from copy
Date: 2016-03-29 15:44:09
Message-ID: CAFj8pRCkK7VYdadRaN6utNcAWLMQEnZtSQFbzGUqwmWTd1H54g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

2016-03-29 0:26 GMT+02:00 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> > [ copy-raw-format-20160227-03.patch ]
>
> I looked at this patch. I'm having a hard time accepting that it has
> a use-case large enough to justify it, and here's the reason: it's
> a protocol break. Conveniently omitting to update protocol.sgml
> doesn't make it not a protocol break. (libpq.sgml also contains
> assorted statements that are falsified by this patch.)
>
> You could argue that it's the user's own fault if he tries to use
> COPY RAW with client-side code that hasn't been updated to support it.
> Maybe that's okay, but I wonder if we're opening ourselves up to
> problems. Maybe even security-grade problems.
>

I tested COPY RAW on old psql clients - and it is working without any
problem - so when the client uses same logic as psql, then it should to
work. Sure, there can be differently implemented clients, but the COPY
client side is usually simple - store stream to output.

Maybe I am blind, but I don't see any new security risks. The risk can be
only on client side - and if client is not able work with new value, then
it can fails. But any attacker can use fake data stream, and can enforce
this error too. So if there are some security risks on special designed
clients, then this risks is existing now.

> In terms of specific code that hasn't been updated, ecpg is broken
> by this patch, and I'm not very sure what libpq's PQbinaryTuples()
> ought to do but probably something other than what it does today.
>
> There's also a definitional question of what we think PQfformat() ought
> to do; should it return "2" for the per-field format? Or maybe the
> per-field format is still "1", since it's after all the same binary data
> format as for COPY BINARY, and only the overall copy format reported by
> PQbinaryTuples() should change to "2".
>

Theoretically the change there is allowed - "Format code zero indicates
textual data representation, while format code one indicates binary
representation. (Other codes are reserved for future definition.) -
PQfformat". But - the format of COPY RAW is binary - this format is cleaner
binary format than is used by COPY BINARY (where is a header + BINARY). I
am thinking so PQbinaryTuples should to return 1 (without change), and
PQfformat should to return 2. If some older client uses deprecated function
PQbinaryTuples(), then 1 is safe value. PQfformat() is documented
differently and if there will be different than expected value, then the
client should to raise a error. So using 2 is safe there. The value 2 is
adequate to actual content

Packet: t=1459265078.596466, session=213070643360702
PGSQL: type=Query, F -> B
QUERY query=copy foo(x) to stdout (format raw);

Packet: t=1459265078.597755, session=213070643360702
PGSQL: type=CopyOutResponse, B -> F
COPY OUT RESPONSE copy format=1, num_fields=1, fields_formats=2

Packet: t=1459265078.597755, session=213070643360702
PGSQL: type=CopyData, B -> F
COPY DATA len=20

Packet: t=1459265078.597755, session=213070643360702
PGSQL: type=CopyDone, B -> F
COPY DONE

Packet: t=1459265078.597755, session=213070643360702
PGSQL: type=CommandComplete, B -> F
COMMAND COMPLETE command='COPY 1'

Packet: t=1459265078.597755, session=213070643360702
PGSQL: type=ReadyForQuery, B -> F
READY FOR QUERY type=<IDLE>

What do you think ?

p.s. These values are returned now

PQfformat(*results, 0)) returns 2 already, PQbinaryTuples() returns 1.

> BTW, I'm not really sure why the patch is trying to enforce single
> row and column for the COPY OUT case. I thought the idea for that
> was that we'd just shove out the data without any delimiters, and
> if it's more than one datum it's the user's problem whether he can
> identify the boundaries. On the input side we would have to insist
> on one column since we're not going to attempt to identify boundaries
> (and one row would fall out of the fact that we slurp the entire input
> and treat it as one datum).
>
> Anyway this is certainly not committable as-is, so I'm setting it back
> to Waiting on Author. But the fact that both libpq and ecpg would need
> updates makes me question whether we can safely pretend that this isn't
> a protocol break.
>

I executed all tests in libpq and ecpg without any problems. Can you,
please, help me with repeating a ecpg issues?

Regards

Pavel

>
> regards, tom lane
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-03-29 15:45:42 Re: VS 2015 support in src/tools/msvc
Previous Message David Steele 2016-03-29 15:43:18 Re: 2016-03 Commitfest