Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY

From: Craig Ringer <craig(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY
Date: 2012-11-15 03:14:13
Message-ID: 50A45E05.3030201@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/15/2012 10:19 AM, Tom Lane wrote:
>
> I disagree very very strongly with that. If we prevent use of shell
> syntax, we will lose a lot of functionality, for instance
>
> copy ... from program 'foo <somefile'
> copy ... from program 'foo | bar'
>
> unless you're imagining that we will reimplement a whole lot of that
> same shell syntax for ourselves. (And no, I don't care whether the
> Windows syntax is exactly the same or not. The program name/path is
> already likely to vary across systems, so it's pointless to suppose that
> use of the feature would be 100% portable if only we lobotomized it.)

That's reasonable - and it isn't worth making people jump through hoops
with ('bash','-c','/some/command < infile') .

> So? You're already handing the keys to the kingdom to anybody who can
> control the contents of that command line, even if it's only to point at
> the wrong program. And one man's "unexpected side-effect" is another
> man's "essential feature", as in my examples above.

That's true if they're controlling the whole command, not so much if
they just provide a file name. I'm just worried that people will use it
without thinking deeply about the consequences, just like they do with
untrusted client input in SQL injection attacks.

I take you point about wanting more than just the execve()-style
invocation. I'd still like to see a way to invoke the command without
having the shell involved, though; APIs to invoke external programs seem
to start out with a version that launches via the shell then quickly
grow more controlled argument-vector versions.

There's certainly room for a quick'n'easy COPY ... FROM PROGRAM ('cmd1 |
cmd2 | tee /tmp/log') . At this point all I think is really vital is to
make copy-with-exec *syntactically different* to plain COPY, and to
leave room for extending the syntax for environment, separate args
vector, etc when they're called for. Like VACUUM, where VACUUM VERBOSE
ANALYZE became VACUUM (VERBOSE, ANALYZE) to make room for (BUFFERS), etc.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2012-11-15 03:44:18 Re: recursive view syntax
Previous Message Kevin Grittner 2012-11-15 02:28:19 Materialized views WIP patch