Re: [Proposal] Page Compression for OLTP

From: chenhj <chjischj(at)163(dot)com>
To: pgsql-hackers(at)postgresql(dot)org, david(at)fetter(dot)org
Subject: Re: [Proposal] Page Compression for OLTP
Date: 2022-07-26 17:47:04
Message-ID: 53eb36f3.4b.1823b9e93ae.Coremail.chjischj@163.com
Lists: pgsql-hackers

Hi hackers,

I have rebased this patch and made some improvements.

1. A header is added to each chunk in the pcd file. It records which block the chunk belongs to and the checksum of the chunk.

Accordingly, all pages in a compressed relation are stored in compressed format, even if the compressed page is larger than BLCKSZ.

The maximum space occupied by a compressed page is BLCKSZ + chunk_size (exceeding this limit raises an error when the page is written).

2. When recovering from a crash, the pca file is repaired using the information recorded in the pcd file.

3. For compressed relations, the free blocks at the end of the relation are not released (just like what old_snapshot_threshold does), which reduces the risk of data inconsistency between the pcd and pca files.

4. During backup, only the checksum in the chunk header is checked for the pcd file; chunks are not assembled and decompressed into the original page.

5. Bug fixes, documentation, code style cleanups and so on.

See src/backend/storage/smgr/README.compression for details.
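
To make items 1 and 4 above more concrete, here is a rough sketch of a chunk header, the BLCKSZ + chunk_size size limit, and a header-only checksum check. It is not taken from the patch: the ChunkHeader layout, the function names, and the FNV-1a checksum are all assumptions for illustration, and the sketch assumes the header is counted inside chunk_size. The README is the reference for the real format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192				/* PostgreSQL's default page size */

/* Hypothetical chunk header; the layout used by the patch may differ. */
typedef struct ChunkHeader
{
	uint32_t	blockno;		/* block this chunk belongs to */
	uint32_t	checksum;		/* checksum of the chunk payload */
} ChunkHeader;

/* Placeholder checksum (32-bit FNV-1a), not PostgreSQL's page checksum. */
static uint32_t
toy_checksum(const uint8_t *data, size_t len)
{
	uint32_t	h = 2166136261u;

	for (size_t i = 0; i < len; i++)
	{
		h ^= data[i];
		h *= 16777619u;
	}
	return h;
}

/* Number of chunks needed to hold compressed_len bytes of page data. */
static size_t
chunks_needed(size_t compressed_len, size_t chunk_size)
{
	size_t		payload;

	assert(chunk_size > sizeof(ChunkHeader));
	payload = chunk_size - sizeof(ChunkHeader);
	return (compressed_len + payload - 1) / payload;
}

/* Returns 1 if the page's chunks occupy at most BLCKSZ + chunk_size bytes. */
static int
fits_limit(size_t compressed_len, size_t chunk_size)
{
	return chunks_needed(compressed_len, chunk_size) * chunk_size
		<= (size_t) BLCKSZ + chunk_size;
}

/* Fill one chunk: zero it, copy the payload, then write the header. */
static void
chunk_init(uint8_t *chunk, size_t chunk_size, uint32_t blockno,
		   const uint8_t *data, size_t len)
{
	ChunkHeader hdr;
	size_t		payload = chunk_size - sizeof(ChunkHeader);

	assert(len <= payload);
	memset(chunk, 0, chunk_size);
	memcpy(chunk + sizeof(hdr), data, len);
	hdr.blockno = blockno;
	hdr.checksum = toy_checksum(chunk + sizeof(hdr), payload);
	memcpy(chunk, &hdr, sizeof(hdr));
}

/* Verify one chunk from its header alone, without decompressing the page. */
static int
chunk_is_valid(const uint8_t *chunk, size_t chunk_size)
{
	ChunkHeader hdr;

	memcpy(&hdr, chunk, sizeof(hdr));
	return hdr.checksum ==
		toy_checksum(chunk + sizeof(hdr), chunk_size - sizeof(hdr));
}

/* Read back the owning block number, e.g. when rebuilding the pca file. */
static uint32_t
chunk_blockno(const uint8_t *chunk)
{
	ChunkHeader hdr;

	memcpy(&hdr, chunk, sizeof(hdr));
	return hdr.blockno;
}
```

In this sketch a backup tool would call chunk_is_valid() on every chunk of the pcd file, and crash recovery could use chunk_blockno() to work out which block each chunk belongs to.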

Other

1. Removed support for a default compression option on tablespaces. I'm not sure this feature is necessary, so it is not supported for now.

2. pg_rewind currently does not support copying only the changed blocks from the pcd file. This feature is relatively independent and could be implemented later.

Best Regards,

Chen Huajun

At 2021-02-18 23:12:57, "David Fetter" <david(at)fetter(dot)org> wrote:
>On Tue, Feb 16, 2021 at 11:15:36PM +0800, chenhj wrote:
>> At 2021-02-16 21:51:14, "Daniel Gustafsson" <daniel(at)yesql(dot)se> wrote:
>>
>> >> On 16 Feb 2021, at 15:45, chenhj <chjischj(at)163(dot)com> wrote:
>> >
>> >> I want to know whether this patch can be accepted by the community, that is, whether it is necessary for me to continue working for this Patch.
>> >> If you have any suggestions, please feedback to me.
>> >
>> >It doesn't seem like the patch has been registered in the commitfest app so it
>> >may have been forgotten about, the number of proposed patches often outnumber
>> >the code review bandwidth. Please register it at:
>> >
>> > https://commitfest.postgresql.org/32/
>> >
>> >..to make sure it doesn't get lost.
>> >
>> >--
>>
>> >Daniel Gustafsson https://vmware.com/
>>
>>
>> Thanks, I will complete this patch and registered it later.
>> Chen Huajun
>
>The simplest way forward is to register it now so it doesn't miss the
>window for the upcoming commitfest (CF), which closes at the end of
>this month. That way, everybody has the entire time between now and
>the end of the CF to review the patch, work on it, etc, and the CF bot
>will be testing it against the changing code base to ensure people
>know if such a change causes it to need a rebase.
>
>Best,
>David.
>--
>David Fetter <david(at)fetter(dot)org> http://fetter.org/
>Phone: +1 415 235 3778
>
>Remember to vote!
>Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment                                       Content-Type              Size
v1-0001-page_compression_16-README.patch         application/octet-stream  10.5 KB
v1-0002-page_compression_16-doc.patch            application/octet-stream  13.4 KB
v1-0003-page_compression_16-main.patch           application/octet-stream  138.4 KB
v1-0004-page_compression_16-main-test.patch      application/octet-stream  108.0 KB
v1-0005-page_compression_16-bin.patch            application/octet-stream  24.0 KB
v1-0006-page_compression_16-pageinspect.patch    application/octet-stream  16.9 KB
