Re: [REVIEW] Re: Compression of full-page-writes

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: "Syed, Rahila" <Rahila(dot)Syed(at)nttdata(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [REVIEW] Re: Compression of full-page-writes
Date: 2015-02-16 11:55:36
Message-ID: 20150216115536.GG20205@awork2.anarazel.de
Lists: pgsql-hackers

On 2015-02-16 11:30:20 +0000, Syed, Rahila wrote:
> - * As a trivial form of data compression, the XLOG code is aware that
> - * PG data pages usually contain an unused "hole" in the middle, which
> - * contains only zero bytes. If hole_length > 0 then we have removed
> - * such a "hole" from the stored data (and it's not counted in the
> - * XLOG record's CRC, either). Hence, the amount of block data actually
> - * present is BLCKSZ - hole_length bytes.
> + * Block images are able to do several types of compression:
> + * - When wal_compression is off, as a trivial form of compression, the
> + * XLOG code is aware that PG data pages usually contain an unused "hole"
> + * in the middle, which contains only zero bytes. If length < BLCKSZ
> + * then we have removed such a "hole" from the stored data (and it is
> + * not counted in the XLOG record's CRC, either). Hence, the amount
> + * of block data actually present is "length" bytes. The hole "offset"
> + * on page is defined using "hole_offset".
> + * - When wal_compression is on, block images are compressed using a
> + * compression algorithm without their hole to improve compression
> + * process of the page. "length" corresponds in this case to the length
> + * of the compressed block. "hole_offset" is the hole offset of the page,
> + * and the length of the uncompressed block is defined by "raw_length",
> + * whose data is included in the record only when compression is enabled
> + * and "with_hole" is set to true, see below.
> + *
> + * "is_compressed" is used to identify if a given block image is compressed
> + * or not. Maximum page size allowed on the system being 32k, the hole
> + * offset cannot be more than 15-bit long so the last free bit is used to
> + * store the compression state of block image. If the maximum page size
> + * allowed is increased to a value higher than that, we should consider
> + * increasing this structure size as well, but this would increase the
> + * length of block header in WAL records with alignment.
> + *
> + * "with_hole" is used to identify the presence of a hole in a block image.
> + * As the length of a block cannot be more than 15-bit long, the extra bit in
> + * the length field is used for this identification purpose. If the block image
> + * has no hole, it is ensured that the raw size of a compressed block image is
> + * equal to BLCKSZ, hence the contents of XLogRecordBlockImageCompressionInfo
> + * are not necessary.
> */
> typedef struct XLogRecordBlockImageHeader
> {
> - uint16 hole_offset; /* number of bytes before "hole" */
> - uint16 hole_length; /* number of bytes in "hole" */
> + uint16 length:15, /* length of block data in record */
> + with_hole:1; /* status of hole in the block */
> +
> + uint16 hole_offset:15, /* number of bytes before "hole" */
> + is_compressed:1; /* compression status of image */
> +
> + /* Followed by the data related to compression if block is compressed */
> } XLogRecordBlockImageHeader;

Yikes, this is ugly.

I think we should change the xlog format so that the block_id (which
currently is XLR_BLOCK_ID_DATA_SHORT/LONG or an actual block id) isn't
the block id but something like XLR_CHUNK_ID. That would be used as is
for XLR_CHUNK_ID_DATA_SHORT/LONG, but for backup blocks it can be set to
XLR_CHUNK_BKP_WITH_HOLE, XLR_CHUNK_BKP_COMPRESSED,
XLR_CHUNK_BKP_REFERENCE... The BKP blocks will then follow, storing the
actual block id right after the chunk id.

Yes, that'll increase the amount of data for a backup block by 1 byte,
but I think that's worth it. I'm pretty sure we will be happy about the
added extensibility pretty soon.
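
A rough sketch of the chunk id idea, to make the layout and the one extra
byte concrete. Nothing here is from the patch: the numeric values, the
helper names (chunk_is_backup_block, write_chunk_header, read_chunk_header)
and the one-byte block id are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical chunk ids. The names follow the proposal above; the
 * numeric values are made up for this sketch.
 */
#define XLR_CHUNK_ID_DATA_SHORT		255
#define XLR_CHUNK_ID_DATA_LONG		254
#define XLR_CHUNK_BKP_WITH_HOLE		253
#define XLR_CHUNK_BKP_COMPRESSED	252
#define XLR_CHUNK_BKP_REFERENCE		251

/* Does this chunk id announce a backup block, i.e. does a block id follow? */
static int
chunk_is_backup_block(uint8_t chunk_id)
{
	return chunk_id == XLR_CHUNK_BKP_WITH_HOLE ||
		chunk_id == XLR_CHUNK_BKP_COMPRESSED ||
		chunk_id == XLR_CHUNK_BKP_REFERENCE;
}

/*
 * Write the chunk id and, for backup blocks only, the block id right
 * after it. Returns the number of bytes written (1 or 2).
 */
static size_t
write_chunk_header(uint8_t *buf, size_t buflen,
				   uint8_t chunk_id, uint8_t block_id)
{
	size_t		n = 0;

	assert(buflen >= 2);
	buf[n++] = chunk_id;
	if (chunk_is_backup_block(chunk_id))
		buf[n++] = block_id;	/* the one extra byte per backup block */
	return n;
}

/*
 * Read a chunk header. Returns the number of bytes consumed, or 0 if
 * the buffer is too short. *block_id is -1 for non-backup chunks.
 */
static size_t
read_chunk_header(const uint8_t *buf, size_t buflen,
				  uint8_t *chunk_id, int *block_id)
{
	if (buflen < 1)
		return 0;
	*chunk_id = buf[0];
	*block_id = -1;
	if (!chunk_is_backup_block(*chunk_id))
		return 1;
	if (buflen < 2)
		return 0;
	*block_id = buf[1];
	return 2;
}
```

With something along these lines, the hole and compression state would be
carried by the chunk id rather than by bits stolen from the length and
hole_offset fields, and a new kind of backup block would only need a new
chunk id.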

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
