Re: directory archive format for pg_dump

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, José Arthur Benetasso Villanova <jose(dot)arthur(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: directory archive format for pg_dump
Date: 2010-12-02 07:49:45
Message-ID: 4CF74F99.4050008@enterprisedb.com
Lists: pgsql-hackers

On 02.12.2010 04:35, Joachim Wieland wrote:
> There is one thing however that I am not in favor of, which is the
> removal of the "sizeHint" parameter for the read functions. The reason
> for this parameter is not very clear now without LZF but I have tried
> to put in a few comments to explain the situation (which you have
> taken out as well :-) ).
>
> The point is that zlib is a stream based compression algorithm, you
> just stuff data in and from time to time you get data out and in the
> end you explicitly flush the compressor. The read function can just
> return as many bytes as it wants and we can just hand it all over to
> zlib. Other compression algorithms however are block based and first
> write a block header that contains the information on the next data
> block, including uncompressed and compressed sizes. Now with the
> sizeHint parameter I used, the compressor could tell the read function
> that it just wants to read the fixed size header (6 bytes IIRC). In
> the header it would look up the compressed size for the next block and
> would then ask the read function to get exactly this amount of data,
> decompress it and go on with the next block, and so forth...
>
> Of course you can possibly do that memory management inside the
> compressor with an extra buffer holding what you got in excess but
> it's a pain. If you removed that part on purpose on the grounds that
> there is no block based compression algorithm in core and probably
> never will be, then that's okay :-)
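
A minimal sketch of the sizeHint protocol described in the quoted text,
for illustration only. Everything named here is hypothetical and not taken
from pg_dump: the ReadFunc signature, the MemSource stand-in for the
archive file, and the header layout (2-byte "BK" tag, 2-byte big-endian
compressed length, 2-byte big-endian uncompressed length). Only the
6-byte header size comes from the mail, and that was quoted from memory.
Decompression is replaced by a plain copy of "stored" blocks so the
control flow stays visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_HDR_SIZE 6	/* size from the mail; the layout is invented */

/* Hypothetical callback: returns exactly sizeHint bytes, fewer only at EOF. */
typedef size_t (*ReadFunc) (void *ctx, unsigned char *buf, size_t sizeHint);

/* Memory-backed source standing in for the archive file. */
typedef struct
{
	const unsigned char *data;
	size_t		len;
	size_t		pos;
} MemSource;

static size_t
mem_read(void *ctx, unsigned char *buf, size_t sizeHint)
{
	MemSource  *src = (MemSource *) ctx;
	size_t		avail = src->len - src->pos;
	size_t		n = sizeHint < avail ? sizeHint : avail;

	memcpy(buf, src->data + src->pos, n);
	src->pos += n;
	return n;
}

/* Decode a big-endian 16-bit length field. */
static size_t
get_u16(const unsigned char *p)
{
	return ((size_t) p[0] << 8) | (size_t) p[1];
}

/*
 * Read blocks until EOF, asking the read function first for the header and
 * then for exactly the payload size the header announces.
 * Returns the number of bytes written to out, or -1 on a malformed stream.
 */
static long
read_blocks(ReadFunc readF, void *ctx, unsigned char *out, size_t outcap)
{
	unsigned char hdr[BLOCK_HDR_SIZE];
	unsigned char block[65535];	/* largest payload a 16-bit length allows */
	size_t		total = 0;

	assert(readF != NULL && out != NULL);

	for (;;)
	{
		size_t		got = readF(ctx, hdr, BLOCK_HDR_SIZE);
		size_t		clen;
		size_t		ulen;

		if (got == 0)
			return (long) total;	/* clean EOF on a block boundary */
		if (got != BLOCK_HDR_SIZE || hdr[0] != 'B' || hdr[1] != 'K')
			return -1;

		clen = get_u16(hdr + 2);
		ulen = get_u16(hdr + 4);

		/* Stand-in for a real codec: only stored blocks are accepted. */
		if (clen != ulen || ulen > outcap - total)
			return -1;
		if (readF(ctx, block, clen) != clen)
			return -1;			/* truncated payload */

		memcpy(out + total, block, ulen);
		total += ulen;
	}
}
```

The decompressor never holds more than one header and one payload, which
is the convenience sizeHint was meant to buy.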

Yeah, we're not going to have LZF built in anytime soon. The external
command approach seems like the best way to support additional
compression algorithms, and I don't think it could do anything with
sizeHint. The custom format didn't obey sizeHint anyway, because it
reads one custom-format block at a time.
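
Without sizeHint, a block-based decompressor would have to keep the
excess bytes itself, as the quoted text notes. A rough sketch of that
buffering follows, under invented assumptions: the BlockBuffer type, the
feed_chunk function, and the 6-byte header layout ("BK" tag, big-endian
compressed and uncompressed lengths) are all hypothetical, and a plain
copy stands in for real decompression. The read side may hand over
chunks of any size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_HDR_SIZE 6		/* hypothetical header size and layout */
#define MAX_BLOCK (BLOCK_HDR_SIZE + 65535)

/* Bytes received from the read function but not yet consumed. */
typedef struct
{
	unsigned char pending[2 * MAX_BLOCK];
	size_t		len;
} BlockBuffer;

/*
 * Append one chunk of arbitrary size, then emit every block that is now
 * complete. Returns the number of bytes written to out by this call, or
 * -1 on a malformed stream or when the chunk does not fit the buffer.
 */
static long
feed_chunk(BlockBuffer *bb, const unsigned char *chunk, size_t n,
		   unsigned char *out, size_t outcap)
{
	size_t		produced = 0;
	size_t		pos = 0;

	assert(bb != NULL && out != NULL);

	if (n > sizeof(bb->pending) - bb->len)
		return -1;
	memcpy(bb->pending + bb->len, chunk, n);
	bb->len += n;

	while (bb->len - pos >= BLOCK_HDR_SIZE)
	{
		const unsigned char *hdr = bb->pending + pos;
		size_t		clen;
		size_t		ulen;

		if (hdr[0] != 'B' || hdr[1] != 'K')
			return -1;
		clen = ((size_t) hdr[2] << 8) | hdr[3];
		ulen = ((size_t) hdr[4] << 8) | hdr[5];

		/* Stand-in for a real codec: only stored blocks are accepted. */
		if (clen != ulen || ulen > outcap - produced)
			return -1;
		if (bb->len - pos < BLOCK_HDR_SIZE + clen)
			break;				/* payload incomplete, wait for more data */

		memcpy(out + produced, hdr + BLOCK_HDR_SIZE, ulen);
		produced += ulen;
		pos += BLOCK_HDR_SIZE + clen;
	}

	/* Keep the unconsumed tail for the next call. */
	memmove(bb->pending, bb->pending + pos, bb->len - pos);
	bb->len -= pos;
	return (long) produced;
}
```

The extra copy and the leftover bookkeeping are the cost of dropping the
hint; the read function itself stays as simple as it is for zlib.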

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
