Re: [REVIEW] Re: Compression of full-page-writes

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Rahila Syed <rahilasyed90(at)gmail(dot)com>, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [REVIEW] Re: Compression of full-page-writes
Date: 2014-11-10 08:26:51
Message-ID: CAB7nPqRFq__=QF0kgzO2kaJshSw8hDV2rw27cMGN=f8XBPUNTA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Nov 9, 2014 at 10:32 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Sun, Nov 9, 2014 at 6:41 AM, Rahila Syed <rahilasyed90(at)gmail(dot)com>
wrote:
>> Hello,
>>
>>>The patch was not applied to the master cleanly. Could you update the
>>> patch?
>> Please find attached updated and rebased patch to compress FPW. Review
>> comments given above have been implemented.
>
> Thanks for updating the patch! Will review it.
>
> BTW, I got the following compiler warnings.
>
> xlogreader.c:755: warning: assignment from incompatible pointer type
> autovacuum.c:1412: warning: implicit declaration of function
> 'CompressBackupBlocksPagesAlloc'
> xlogreader.c:755: warning: assignment from incompatible pointer type
I have been looking at this patch, here are some comments:
1) This documentation change is incorrect:
- <term><varname>full_page_writes</varname> (<type>boolean</type>)
+ <term><varname>full_page_writes</varname> (<type>enum</type>)</term>
<indexterm>
<primary><varname>full_page_writes</> configuration
parameter</primary>
</indexterm>
- </term>
The termination of block term was correctly places before.
2) This patch defines FullPageWritesStr and full_page_writes_str, but both
do more or less the same thing.
3) This patch is touching worker_spi.c and calling
CompressBackupBlocksPagesAlloc directly. Why is that necessary? Doesn't a
bgworker call InitXLOGAccess once it connects to a database?
4) Be careful as well of whitespaces (code lines should have as well a
maximum of 80 characters):
+ * If compression is set on replace the rdata nodes of backup blocks
added in the loop
+ * above by single rdata node that contains compressed backup blocks
and their headers
+ * except the header of first block which is used to store the
information about compression.
+ */
5) GetFullPageWriteGUC or something similar is necessary, but I think that
for consistency with doPageWrites its value should be fetched in XLogInsert
and then passed as an extra argument in XLogRecordAssemble. Thinking more
about this, I think that it would be cleaner to simply have a bool flag
tracking if compression is active or not, something like doPageCompression,
that could be fetched using GetFullPageWriteInfo. Thinking more about it,
we could directly track forcePageWrites and fullPageWrites, but that would
make back-patching more difficult with not that much gain.
6) Not really a complaint, but note that this patch is using two bits that
were unused up to now to store the compression status of a backup block.
This is actually safe as long as the maximum page is not higher than 32k,
which is the limit authorized by --with-blocksize btw. I think that this
deserves a comment at the top of the declaration of BkpBlock.
! unsigned hole_offset:15, /* number of bytes before "hole" */
! flags:2, /* state of a
backup block, see below */
! hole_length:15; /* number of bytes in
"hole" */
7) Some code in RestoreBackupBlock:
+ char *uncompressedPages;
+
+ uncompressedPages = (char *)palloc(XLR_TOTAL_BLCKSZ);
[...]
+ /* Check if blocks in WAL record are compressed */
+ if (bkpb.flag_compress == BKPBLOCKS_COMPRESSED)
+ {
+ /* Checks to see if decompression is successful is made
inside the function */
+ pglz_decompress((PGLZ_Header *) blk, uncompressedPages);
+ blk = uncompressedPages;
+ }
uncompressedPages is pallocd'd all the time but you actually just need to
do that when the block is compressed.
8) Arf, I don't like much the logic around CompressBackupBlocksPagesAlloc
using a malloc to allocate once the space necessary for compressed and
uncompressed pages. You are right to not do that inside a critical section,
but PG tries to maximize the allocations to be palloc'd. Now it is true
that if a palloc does not succeed, PG always ERROR's out (writer adding
entry to TODO list)... Hence I think that using a static variable for those
compressed and uncompressed pages makes more sense, and this simplifies
greatly the patch as well.
9) Is avw_sighup_handler really necessary, what's wrong in allocating it
all the time by default? This avoids some potential caveats in error
handling as well as in value updates for full_page_writes.

So, note that I am not only complaining about the patch, I actually rewrote
it as attached while reviewing, with additional minor cleanups and
enhancements. I did as well a couple of tests like the script attached,
compression numbers being more or less the same as your previous patch,
some noise creating differences. I have done also some regression test runs
with a standby replaying behind.

I'll go through the patch once again a bit later, but feel free to comment.
Regards,
--
Michael

Attachment Content-Type Size
compress_test_2.sh application/x-sh 759 bytes
20141110_fpw_compression_v5.patch application/x-patch 30.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message furuyao 2014-11-10 10:19:30 Re: pg_receivexlog --status-interval add fsync feedback
Previous Message Amit Kapila 2014-11-10 07:37:25 Re: [v9.5] Custom Plan API