Re: [PATCH] COPY .. COMPRESSED

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] COPY .. COMPRESSED
Date: 2013-01-14 13:43:11
Message-ID: 20130114134311.GH16126@tamriel.snowman.net
Lists: pgsql-hackers

Tom,

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > Attached is a patch to add a 'COMPRESSED' option to COPY which will
> > cause COPY to expect a gzip'd file on input and which will output a
> > gzip'd file on output. Included is support for backend COPY, psql's
> > \copy, regression tests for both, and documentation.
>
> I don't think it's a very good idea to invent such a specialized option,
> nor to tie it to gzip, which is widely considered to be old news.

We're already using gzip/zlib for pg_dump/pg_restore, so it was simple
and straightforward to add and would allow utilizing this option while
keeping the custom dump format the same. It also happens to match what
I need. While gzip might be 'old hat' it's still extremely popular.
I'd be happy to add support for bzip2 or something else that people are
interested in, and support compression options for zlib if necessary
too. This was intended to get the ball rolling on something, as the last
discussion that I had seen while hunting through the archives was from
2006; obviously I missed the boat on the last set of patches.

> There was discussion (and, I think, a patch in the queue) for allowing
> COPY to pipe into or out of an arbitrary shell pipe. Why would that not
> be enough to cover this use-case? That is, instead of a hard-wired
> capability, people would do something like COPY TO '| gzip >file.gz'.
> Or they could use bzip2 or whatever struck their fancy.

Sounds like a nice idea, but I can't imagine it'd be available to anyone
except for superusers, and looking at that patch, that's exactly the
restriction which is in place for it. In addition, that patch's support
for "\copy" implements everything locally, making it little different
from "zcat mycsv.csv.gz | psql". The patch that I proposed actually
sent the compressed stream across the wire, reducing bandwidth
utilization.
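
To make the bandwidth point concrete, here is a rough sketch of the
difference between decompressing locally before sending and sending the
gzip'd stream itself. It does not talk to PostgreSQL at all; the helper
names (make_csv, wire_bytes) and the synthetic CSV data are assumptions
made up for illustration, and the actual savings depend entirely on the
data being copied.

```python
import csv
import gzip
import io


def make_csv(rows: int) -> bytes:
    """Build a synthetic CSV payload (hypothetical data, for illustration only)."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for i in range(rows):
        writer.writerow([i, f"customer_{i % 100}", "2013-01-14", i * 3 % 1000])
    return buf.getvalue().encode("utf-8")


def wire_bytes(payload: bytes, compressed_on_wire: bool) -> int:
    """Return how many bytes would cross the connection for this payload.

    compressed_on_wire=False models 'zcat file.gz | psql': the client
    decompresses first, so the raw CSV is what gets sent.
    compressed_on_wire=True models sending the gzip stream as-is and
    letting the receiving side decompress it.
    """
    return len(gzip.compress(payload)) if compressed_on_wire else len(payload)


raw = make_csv(10_000)
sent_uncompressed = wire_bytes(raw, compressed_on_wire=False)
sent_compressed = wire_bytes(raw, compressed_on_wire=True)

# The receiver recovers exactly the same CSV after decompression.
assert gzip.decompress(gzip.compress(raw)) == raw

print(f"raw CSV on the wire:  {sent_uncompressed} bytes")
print(f"gzip'd on the wire:   {sent_compressed} bytes")
```

With repetitive CSV like this the compressed stream is much smaller,
which is the case where shipping it across the wire unmodified pays off.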

All that said, I've nothing against having the pipe option for the
backend COPY command; I'm a bit annoyed with myself for somehow missing
that patch. I don't like what it's doing with psql's \copy command and would
rather we figure out a way to support PROGRAM .. TO STDOUT, but that
still would require superuser privileges. I don't see any easy way to
support compressed data streaming to/from the server for COPY w/o
defining what methods are available or coming up with some ACL system
for what programs can be called by the backend.

Thanks,

Stephen
