Re: directory archive format for pg_dump

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, José Arthur Benetasso Villanova <jose(dot)arthur(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: directory archive format for pg_dump
Date: 2010-12-01 14:03:15
Message-ID: 4CF655A3.4090308@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 29.11.2010 22:21, Heikki Linnakangas wrote:
> On 29.11.2010 07:11, Joachim Wieland wrote:
>> On Mon, Nov 22, 2010 at 3:44 PM, Heikki Linnakangas
>> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>> * wrap long lines
>>> * use extern in function prototypes in header files
>>> * "inline" some functions like _StartDataCompressor, _EndDataCompressor,
>>> _DoInflate/_DoDeflate that aren't doing anything but call some other
>>> function.
>>
>> So here is a new round of patches. It turned out that the feature to
>> allow to also restore files from a different dump and with a different
>> compression required some changes in the compressor API. And in the
>> end I didn't like all the #ifdefs either and made a less #ifdef-rich
>> version using function pointers.
>
> Ok. The separate InitCompressorState() and AllocateCompressorState()
> functions seem unnecessary. As the code stands, there's little
> performance gain from re-using the same CompressorState, just
> re-initializing it, and I can't see any other justification for them
> either.
>
> I combined those, and the Free/Flush steps, and did a bunch of other
> editorializations and cleanups. Here's an updated patch, also available
> in my git repository at
> git://git.postgresql.org/git/users/heikki/postgres.git, branch
> "pg_dump-dir". I'm going to continue reviewing this later, tomorrow
> hopefully.

Here's another update. I changed things quite heavily. I didn't see the
point of having the Alloc+Free functions for uncompressing, because the
ReadDataFromArchive processed the whole input stream in one go anyway.
So the new API consists of four functions, AllocateCompressor,
WriteDataToArchive and EndCompressor for writing, and
ReadDataFromArchive for reading.

Also, I reverted the zlib buffer size from 64k to 4k. If you want to
raise that, let's discuss that separately.

Please let me know what you think of this version, or if you spot any
bugs. I'll keep working on this, I'm hoping to get this into committable
shape by the end of the week.

The pg_backup_directory patch naturally won't apply over this anymore.
Once we have the compress_io part in shape, that will need to be fixed.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2010-12-01 14:05:34 Re: directory archive format for pg_dump
Previous Message Andrew Dunstan 2010-12-01 14:00:13 Re: We really ought to do something about O_DIRECT and data=journalled on ext4