Re: libpq compression

From: Daniil Zakhlystov <usernamedt(at)yandex-team(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: libpq compression
Date: 2020-11-26 13:15:42
Message-ID: 6811D196-E2FB-40CC-B1C9-F19427FBF675@yandex-team.ru
Lists: pgsql-hackers

Hi,

> On Nov 24, 2020, at 11:35 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> So the time to talk about the
> general approach here is now, before anything gets committed, before
> the project has committed itself to any particular design. If we
> decide in that discussion that certain things can be left for the
> future, that's fine. If we've discussed how they could be added
> without breaking backward compatibility, even better. But we can't
> just skip over having that discussion.

> If the client requests compression and the server supports it, it
> should return a new SupportedCompressionTypes message following
> NegotiateProtocolMessage response. That should be a list of
> compression methods which the server understands. At this point, the
> client and the server each know what methods the other understands.
> Each should now feel free to select a compression method the other
> side understands, and to switch methods whenever desired, as long as
> they only select from methods the other side has said that they
> understand. The patch seems to think that the compression method has
> to be the same in both directions and that it can never change, but
> there's no real reason for that. Let each side start out uncompressed
> and then let it issue a new SetCompressionMethod protocol message to
> switch the compression method whenever it wants. After sending that
> message it begins using the new compression type. The other side
> doesn't have to agree. That way, you don't have to worry about
> synchronizing the two directions. Each side is just telling the other
> what it is choosing to do, from among the options the other side said it
> could understand.

I’ve read your suggestions about compression that can be switched on the fly and chosen independently for each direction.

While the proposed protocol seems straightforward, the ability to switch the compression mode at an arbitrary moment significantly complicates the implementation, which may lead to lower adoption of this really useful feature in custom frontends and backends.

However, I don’t mean that we shouldn’t support switchable compression. I propose that we offer two compression modes: permanent (which is what the patch currently implements) and switchable on the fly. Permanent compression allows us to deliver a robust solution of a kind that is already present in some databases. Switchable compression allows us to support more complex scenarios in cases where the frontend and backend really need it and can afford the development effort to implement it.

I’ve made a draft of a protocol that covers both of these compression modes. It also supports independent frontend and backend compression.

In the _pq_.compression parameter of the StartupPacket, the frontend will specify:

1. Supported compression modes in order of preference.
For example: “permanent, switchable” means that the frontend supports both permanent and switchable modes and prefers to use the permanent mode.

2. List of the compression algorithms which the frontend is able to decompress, in order of preference.
For example:
“zlib:1,3,5;zstd:7,8;uncompressed” means that the frontend is able to:
- decompress zlib with compression levels 1, 3 or 5
- decompress zstd with compression levels 7 or 8
- “uncompressed” at the end means that the frontend agrees to receive uncompressed messages. If “uncompressed” is not listed, compression is required.
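
To make the list format above concrete, here is a minimal parsing sketch in Python. It is only an illustration of the draft format, not code from the patch: the function names parse_algorithms and compression_required are hypothetical, and it assumes that entries are separated by ';', that levels follow a ':' and are separated by ',', and that an entry without levels (such as “uncompressed”) has an empty level list.

```python
def parse_algorithms(spec):
    """Parse a list such as 'zlib:1,3,5;zstd:7,8;uncompressed'.

    Returns (name, levels) pairs in the order of preference given in spec.
    Hypothetical helper; the separators are assumed from the draft examples.
    """
    result = []
    for entry in spec.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # Everything before ':' is the algorithm name, the rest is the level list.
        name, sep, levels = entry.partition(":")
        level_list = [int(x) for x in levels.split(",") if x.strip()] if sep else []
        result.append((name.strip(), level_list))
    return result


def compression_required(algorithms):
    """True if 'uncompressed' is absent, i.e. the sender requires compression."""
    return all(name != "uncompressed" for name, _ in algorithms)


frontend = parse_algorithms("zlib:1,3,5;zstd:7,8;uncompressed")
print(frontend)
print(compression_required(frontend))
```

Keeping the result as an ordered list rather than a dictionary preserves the order of preference, which the index-based replies later in the protocol rely on.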

After receiving the StartupPacket message from the frontend, the backend will either ignore _pq_.compression as an unknown parameter (if the backend predates November 2017) or respond with a CompressionAck message, which will include:

1. Index of the chosen compression mode, or -1 if the backend doesn’t support any of the compression modes sent by the frontend.
For the startup packet from the previous example, it may be ‘0’ if the server chose permanent mode, ‘1’ if it chose switchable, or ‘-1’ if the server doesn’t support either of these.

2. List of the compression algorithms which the backend is able to decompress, in order of preference.
For example, “zstd:2,4;uncompressed;zlib:7” means that the backend is able to:
- decompress zstd with compression levels 2 and 4
- work in uncompressed mode
- decompress zlib with compression level 7
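
The mode index carried in CompressionAck (item 1 above) could be computed roughly as follows. This is a sketch under assumptions: choose_mode_index is a hypothetical name, the mode list is assumed to be comma-separated as in the “permanent, switchable” example, and the backend is assumed to honor the frontend's order of preference.

```python
def choose_mode_index(frontend_modes, backend_supported):
    """Return the index of the first frontend mode the backend supports, or -1.

    frontend_modes: string such as 'permanent, switchable' (frontend preference order).
    backend_supported: set of mode names the backend implements (assumed input).
    """
    modes = [m.strip() for m in frontend_modes.split(",")]
    for index, mode in enumerate(modes):
        if mode in backend_supported:
            return index
    return -1


print(choose_mode_index("permanent, switchable", {"permanent", "switchable"}))
print(choose_mode_index("permanent, switchable", {"switchable"}))
print(choose_mode_index("permanent, switchable", set()))
```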

After sending the CompressionAck message, the backend will also send a SetCompressionMessage with one of the following:
- Index of the chosen backend compression algorithm followed by the index of the chosen compression level. In this case, the frontend should now use the chosen decompressor for incoming messages, and the backend should use the chosen compressor for outgoing messages.
- '-1', if the backend doesn’t support compression using any of the algorithms sent by the frontend. In this case, the frontend must terminate the connection after receiving this message.

After receiving the SetCompressionMessage from the backend, the frontend should also reply with a SetCompressionMessage containing one of the following:
- Index of the chosen frontend compression algorithm followed by the index of the chosen compression level. In this case, the backend should now use the chosen decompressor for incoming messages, and the frontend should use the chosen compressor for outgoing messages.
- '-1', if the frontend doesn’t support compression using any of the algorithms sent by the backend. In this case, the frontend should terminate the connection after sending this message.
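
The choice that each side puts into its SetCompressionMessage can be sketched as below. The sketch makes several assumptions that the draft does not spell out: indexes refer to positions in the list of algorithms the peer said it can decompress, the sender picks the first entry in the peer's order of preference that it can compress with, and an entry without levels (“uncompressed”) is reported with level index 0. The name choose_compression and the shape of its arguments are hypothetical.

```python
def choose_compression(peer_decompress, own_compress):
    """Pick what to announce in SetCompressionMessage.

    peer_decompress: ordered (name, levels) pairs the peer can decompress.
    own_compress: dict mapping algorithm name to the set of levels this side
                  can compress with (assumed to come from local configuration).
    Returns (algorithm_index, level_index), or -1 if nothing matches.
    """
    for algo_index, (name, levels) in enumerate(peer_decompress):
        if name not in own_compress:
            continue
        if not levels:
            # Assumption: algorithms without levels use level index 0.
            return (algo_index, 0)
        for level_index, level in enumerate(levels):
            if level in own_compress[name]:
                return (algo_index, level_index)
    return -1


# What the frontend can decompress, from the StartupPacket example.
frontend_list = [("zlib", [1, 3, 5]), ("zstd", [7, 8]), ("uncompressed", [])]

print(choose_compression(frontend_list, {"zstd": {8}, "uncompressed": set()}))
print(choose_compression(frontend_list, {"lz4": {1}}))
```

With these inputs the backend would announce zstd (index 1) at level 8 (index 1); in the second call there is no common algorithm, so it would send -1 and the connection would be terminated.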

After that sequence of messages, the frontend and backend may continue the usual conversation. In permanent compression mode, further use of SetCompressionMessage is prohibited on both the frontend and backend sides.

Supported compression and decompression methods are configured using GUC parameters:

compress_algorithms = '...'      # default value is 'uncompressed'
decompress_algorithms = '...'    # default value is 'uncompressed'

Please let me know if anything in the protocol description is unclear so that I can clarify whatever I might have missed. I would appreciate hearing your opinion on the proposed protocol.

Thanks,

Daniil Zakhlystov
