Re: [PATCH] COPY .. COMPRESSED

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] COPY .. COMPRESSED
Date: 2013-01-15 15:55:04
Message-ID: 20130115155504.GQ16126@tamriel.snowman.net
Lists: pgsql-hackers

* Peter Eisentraut (peter_e(at)gmx(dot)net) wrote:
> Operating on compressed files transparently in file_fdw is obviously
> useful, but why only gzip?

This isn't really an argument, imv. It's only gzip *right this moment*
because that's all that I implemented. I've already offered to add
bzip2 or whatever else people would like.

> The gold standard is GNU tar, which can
> operate on any compressed file in a variety of compression formats
> without even having to specify an option.

Yes, that's what I was hoping to get to, eventually.
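As an illustration only (this is not code from the patch), tar-style
auto-detection can be sketched by sniffing the leading "magic" bytes of
a file and picking a decompressor from them. The function names and the
set of formats below are assumptions made for the example; the magic
byte values are the well-known gzip, bzip2 and xz signatures.

```python
import bz2
import gzip
import lzma

# Leading signature bytes for a few common compression formats.
MAGIC = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"\xfd7zXZ\x00": "xz",
}

# Standard-library openers keyed by format name.
OPENERS = {"gzip": gzip.open, "bzip2": bz2.open, "xz": lzma.open}


def sniff_compression(path):
    """Return the detected format name, or None if no signature matches."""
    with open(path, "rb") as f:
        head = f.read(6)  # longest signature above is 6 bytes
    for magic, name in MAGIC.items():
        if head.startswith(magic):
            return name
    return None


def open_maybe_compressed(path):
    """Open path for binary reading, decompressing transparently if needed."""
    fmt = sniff_compression(path)
    opener = OPENERS.get(fmt, open)  # fall back to a plain open()
    return opener(path, "rb")
```

The caller never names the format: open_maybe_compressed("data.csv.gz")
and open_maybe_compressed("data.csv") both return a readable binary
stream, which is the behaviour being held up as the goal here.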

> Writing compressed COPY output files on the backend has limited uses, at
> least none have been clearly explained, and the popen patch might
> address those better.

I do see value in the popen patch for server-side operations.

> Writing compressed COPY output on the frontend can already be done
> differently.

Certainly. In a similar vein, I'm not convinced that the popen patch
for psql's \copy is really a great addition.

> Compression on the wire is a different debate and it probably shouldn't
> be snuck in through this backdoor.

Considering the COPY-COMPRESSED-to-FE piece is the vast majority of the
patch, I hope you understand that it certainly wasn't my intent to try
and 'sneak it in'. Support for reading and writing compressed files
with COPY directly from the FE was one of my goals from the start on
this.

> Putting compressed COPY output from the backend straight into a
> compressed pg_dump file sounds interesting, but this patch doesn't do
> that yet, and I think there will be more issues to solve there.

Let me just vent my dislike for the pg_dump code. :) Probably half the
time spent on this overall patch was fighting with that to make it work
and it's actually about 90% of the way there, imv. Getting the
compressed data into pg_dump is working in my local branch, going to a
directory-format dump output, but the custom format is causing me some
difficulties which I believe are related to the blocking that's used and
that the blocks coming off the wire were 'full-size', if you will,
instead of being chunked down to 4KB by the client-side compression.
I've simply not had time to debug it and fix it and wanted to get the
general patch out for discussion (which I'm glad that I did, given that
there's other work going on that's related).
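To make the suspected blocking mismatch concrete, here is a rough
sketch (not pg_dump code, and not a confirmed fix) of re-chunking
arbitrarily sized blocks into fixed 4KB pieces. The function name
rechunk and the iterator-of-bytes interface are assumptions for the
example; only the 4KB figure comes from the description above.

```python
def rechunk(blocks, size=4096):
    """Yield pieces of exactly `size` bytes; only the last may be shorter.

    `blocks` is any iterable of bytes objects of arbitrary length, for
    example 'full-size' blocks as they arrive off the wire.
    """
    buf = bytearray()
    for block in blocks:
        buf.extend(block)
        # Emit as many full-size pieces as the buffer currently holds.
        while len(buf) >= size:
            yield bytes(buf[:size])
            del buf[:size]
    if buf:
        # Flush whatever is left as a final short piece.
        yield bytes(buf)
```

Concatenating the output gives back the input bytes unchanged; only the
block boundaries move.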

Thanks,

Stephen
