Re: Teach pg_receivewal to use lz4 compression

From: gkokolatos(at)pm(dot)me
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Teach pg_receivewal to use lz4 compression
Date: 2021-10-29 09:45:41
Message-ID: 8CDVNs7db0aITUJjQ8TkK6pX_mO3vTVdpr0niTKodjcNw8DnUIK1fUWHeTYf9oYQP3YfizJT5XevtKdp7xJ04LU_yRI7MY8pPutua3eQAUo=@pm.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

On Saturday, September 18th, 2021 at 8:18 AM, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> On Fri, Sep 17, 2021 at 08:12:42AM +0000, gkokolatos(at)pm(dot)me wrote:
>
> I have been digging into the issue I saw in the TAP tests when closing
> a segment, and found the problem. The way you manipulate
> frameInfo.contentSize by just setting it to WalSegSz when *opening*
> a segment causes problems on LZ4F_compressEnd(), making the code
> throw a ERROR_frameSize_wrong. In lz4frame.c, the end of
> LZ4F_compressEnd() triggers this check and the error:
> if (cctxPtr->prefs.frameInfo.contentSize) {
> if (cctxPtr->prefs.frameInfo.contentSize != cctxPtr->totalInSize)
> return err0r(LZ4F_ERROR_frameSize_wrong);
> }
>
> We don't really care about contentSize as long as a segment is not
> completed. Rather than filling contentSize all the time we write
> something, we'd better update frameInfo once the segment is
> completed and closed. That would also take take of the error as this
> is not checked if contentSize is 0. It seems to me that we should
> fill in the information when doing a CLOSE_NORMAL.

Thank you for the comment. I think that the opposite should be done. At the time
that the file is closed, the header is already written to disk. We have no way
to know that is not. If we need to go back to refill the information, we will
have to ask for the API to produce a new header. There is little guarantee that
the header size will be the same and as a consequence we will have to shift
the actual data around.

In the attached, the header is rewritten only when closing an incomplete
segment. For all intents and purposes that segment is not usable. However there
might be custom scripts that might want to attempt to parse even an otherwise
unusable file.

A different and easier approach would be to simply prepare the LZ4 context for
future actions and simply ignore the file.

>
> - if (stream->walmethod->compression() == 0 &&
> + if (stream->walmethod->compression() == COMPRESSION_NONE &&
> stream->walmethod->existsfile(fn))
> This one was a more serious issue, as the compression() callback would
> return an integer for the compression level but v5 compared it to a
> WalCompressionMethod. In order to take care of this issue, mainly for
> pg_basebackup, I think that we have to update the compression()
> callback to compression_method(), and it is cleaner to save the
> compression method as well as the compression level for the tar data.
>

Agreed.

> I am attaching a new patch, on which I have done many tweaks and
> adjustments while reviewing it. The attached patch fixes the second
> issue, and I have done nothing about the first issue yet, but that
> should be simple enough to address as this needs an update of the
> frame info when closing a completed segment. Could you look at it?
>

Thank you. Find v7 attached, rebased to the current head.

Cheers,
//Georgios

> Thanks,
> --
> Michae

Attachment Content-Type Size
v7-0001-Teach-pg_receivewal-to-use-LZ4-compression.patch text/x-patch 36.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joel Mariadasan (jomariad) 2021-10-29 10:40:06 Vulnerability identified with Postgres 13.4 for Windows
Previous Message Yugo NAGATA 2021-10-29 09:16:28 Re: Implementing Incremental View Maintenance