Re: [REVIEW] Re: Compression of full-page-writes

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Rahila Syed <rahilasyed90(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [REVIEW] Re: Compression of full-page-writes
Date: 2014-12-08 06:48:52
Message-ID: CAB7nPqQ9t97kg99v1nEJETd=-=oXNcnq9A4iP1uoez+PLSw4Xg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Dec 8, 2014 at 3:42 PM, Rahila Syed <rahilasyed90(at)gmail(dot)com> wrote:
>
>>The important point to consider for this patch is the use of the additional
>> 2-bytes as uint16 in the block information structure to save the length of a
>> compressed
>>block, which may be compressed without its hole to achieve a double level
>> of compression (image compressed without its hole). We may use a simple flag
>> on
>>one or two bits using for example a bit from hole_length, but in this case
>> we would need to always compress images with their hole included, something
>> more
> >expensive as the compression would take more time.
> As you have mentioned here the idea to use bits from existing fields rather
> than adding additional 2 bytes in header,
> FWIW elaborating slightly on the way it was done in the initial patches,
> We can use the following struct
>
> unsigned hole_offset:15,
> compress_flag:2,
> hole_length:15;
>
> Here compress_flag can be 0 or 1 depending on status of compression. We can
> reduce the compress_flag to just 1 bit flag.
Just adding that this is fine as the largest page size that can be set is 32k.

> IIUC, the purpose of adding compress_len field in the latest patch is to
> store length of compressed blocks which is used at the time of decoding the
> blocks.
>
> With this approach, length of compressed block can be stored in hole_length
> as,
>
> hole_length = BLCKSZ - compress_len.
>
> Thus, hole_length can serve the purpose of storing length of a compressed
> block without the need of additional 2-bytes. In DecodeXLogRecord,
> hole_length can be used for tracking the length of data received in cases of
> both compressed as well as uncompressed blocks.
>
> As you already mentioned, this will need compressing images with hole but
> we can MemSet hole to 0 in order to make compression of hole less expensive
> and effective.

Thanks for coming back to this point in more details, this is very
important. The additional 2 bytes used make compression less expensive
by ignoring the hole, for a bit more data in each record. Using uint16
is as well a cleaner code style, more in-line wit hte other fields,
but that's a personal opinion ;)

Doing a switch from one approach to the other is easy enough though,
so let's see what others think.
Regards,
--
Michael

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2014-12-08 07:21:59 Re: On partitioning
Previous Message Rahila Syed 2014-12-08 06:42:30 Re: [REVIEW] Re: Compression of full-page-writes