Re: libpq compression

From: Daniil Zakhlystov <usernamedt(at)yandex-team(dot)ru>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Konstantin Knizhnik <knizhnik(at)garret(dot)ru>
Subject: Re: libpq compression
Date: 2020-11-24 17:34:39
Message-ID: 160623927937.7563.15974279758478581796.pgcf@coridan.postgresql.org

The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: tested, passed
Spec compliant: tested, failed
Documentation: tested, failed

Submission review
--
Is the patch in a patch format which has context? (eg: context diff format)
NO, needs to be fixed
Does it apply cleanly to the current git master?
YES
Does it include reasonable tests, necessary doc patches, etc?
Has docs, but tests are missing

Usability review
--
At the moment, the patch supports per-connection (permanent) compression. The frontend can specify the desired compression algorithms and compression levels and then negotiate with the backend which algorithm is actually going to be used. In its current state, the patch is missing the ability to enable/disable compression on the backend side, which I think is not great from a usability standpoint.

Regarding on-the-fly configurable compression and different compression algorithms for each direction: these two ideas are promising but tend to make the implementation more complex. However, the current implementation can be extended to support them in the future. For example, we could expose switchable on-the-fly compression as a 'switchable' algorithm and negotiate it like a regular compression algorithm (just as 'zstd' and 'zlib' are negotiated now). The 'switchable' algorithm could then introduce new protocol messages to make the on-the-fly compression magic work.

The same applies to Robert's idea of using different compression algorithms for the two directions: it can be introduced later as a new compression 'algorithm' with its own protocol messages.
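To illustrate the idea (purely hypothetical syntax, not something the current patch implements), such a mode could be requested alongside the ordinary algorithms through the existing compression parameter, for example:

psql -d "host=xxx port=5432 dbname=xxx user=xxx compression=switchable,zstd:5,zlib"

and the backend would only negotiate 'switchable' if it understands the extra protocol messages that mode requires.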

Does the patch actually implement that?
YES

Do we want that?
YES

Do we already have it?
NO

Does it follow SQL spec, or the community-agreed behavior?
To be discussed

Does it include pg_dump support (if applicable)?
not applicable

Are there dangers?
A CRIME-like attack is theoretically possible when compression is used together with SSL.

Have all the bases been covered?
To be discussed

Feature test
--

I’ve applied the patch, compiled, and tested it with configure options --enable-cassert and --enable-debug turned on. I’ve tested the following scenarios:

1. make check
=======================
All 201 tests passed.
=======================

2. make check-world
initially failed with:
============== running regression test queries ==============
test postgres_fdw ... FAILED 4465 ms
============== shutting down postmaster ==============
======================
1 of 1 tests failed.
======================
The differences that caused some tests to fail can be viewed in the
file "/xxx/xxx/review/postgresql/contrib/postgres_fdw/regression.diffs". A copy of the test summary that you see above is saved in the file "/xxx/xxx/review/postgresql/contrib/postgres_fdw/regression.out".

All tests passed after replacing ‘gsslib, target_session_attrs’ with ‘gsslib, compression, target_session_attrs’ in line 8914 of postgresql/contrib/postgres_fdw/expected/postgres_fdw.out

3. simple psql utility usage
psql -d "host=xxx port=5432 dbname=xxx user=xxx compression=1"

4. pgbench tpcb-like w/ SSL turned ON
pgbench "host=xxx port=5432 dbname=xxx user=xxx sslmode=require compression=1" --builtin tpcb-like -t 70 --jobs=32 --client=700

5. pgbench tpcb-like w/ SSL turned OFF
pgbench "host=xxx port=5432 dbname=xxx user=xxx sslmode=disable compression=1" --builtin tpcb-like -t 70 --jobs=32 --client=700

6. pgbench initialization w/ SSL turned ON
pgbench "host=xxx port=5432 dbname=xxx user=xxx sslmode=require compression=1" -i -s 500

7. pgbench initialization w/ SSL turned OFF
pgbench "host=xxx port=5432 dbname=xxx user=xxx sslmode=disable compression=1" -i -s 500

8. Streaming physical replication. Recovery-related parameters
recovery_target_timeline = 'latest'
primary_conninfo = 'host=xxx port=5432 user=repl application_name=xxx compression=1'
primary_slot_name = 'xxx'
restore_command = 'some command'

9. This compression has been implemented in an experimental build of the odyssey connection pooler and tested with a configuration of ~1500 synthetic simultaneous clients and ~300 GB databases.
During that testing, I reported and fixed some of the issues that came up.

Does the feature work as advertised?
YES
Are there corner cases the author has failed to consider?
NO
Are there any assertion failures or crashes?
NO

Performance review
--

Does the patch slow down simple tests?
NO

If it claims to improve performance, does it?
YES

Does it slow down other things?
Using compression may add CPU overhead, which mostly depends on the compression algorithm and the chosen compression level. During testing with the ZSTD algorithm at compression level 1, there was about 10% CPU overhead in read/write-balanced scenarios and almost no overhead in mostly-read scenarios.

Coding review
--

In protocol.sgml:
> It can be just boolean values enabling or disabling compression
> ("true"/"false", "on"/"off", "yes"/"no", "1"/"0"), "auto" or explicit list of compression algorithms
> separated by comma with optional specification of compression level: "zlib,zstd:5".

But in fe-protocol3.c:
> if (pg_strcasecmp(value, "true") == 0 ||
> pg_strcasecmp(value, "yes") == 0 ||
> pg_strcasecmp(value, "on") == 0 ||
> pg_strcasecmp(value, "any") == 0 ||
> pg_strcasecmp(value, "1") == 0)
> {
I believe there is a mismatch: the docs mention an "auto" value, but in the code "auto" is missing while "any" exists. Actually, I propose removing both "auto" and "any", because they work the same way as "true/on/yes/1" but look like something different.
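For illustration, if "any" and "auto" were dropped, the check in fe-protocol3.c could shrink to something like the following (just a sketch derived from the quoted code, not the patch's actual wording):

/* Sketch: only the boolean-style spellings enable compression with defaults */
if (pg_strcasecmp(value, "true") == 0 ||
    pg_strcasecmp(value, "yes") == 0 ||
    pg_strcasecmp(value, "on") == 0 ||
    pg_strcasecmp(value, "1") == 0)
{
    /* enable compression with the default algorithm list */
}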

In fe-protocol3.c:
>#define pq_read_conn(conn) \
> (conn->zstream \
> ? zpq_read(conn->zstream, conn->inBuffer + conn->inEnd, \
> conn->inBufSize - conn->inEnd) \
> : pqsecure_read(conn, conn->inBuffer + conn->inEnd, \
> conn->inBufSize - conn->inEnd))
I think there should be a comment explaining how the read function is chosen here. The same applies to the zpq_write calls. Also, pq_read_conn is defined as a macro, but there is no corresponding pq_write_conn macro.
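For symmetry, a pq_write_conn macro could mirror the read path; the sketch below assumes that zpq_write and pqsecure_write take the same kinds of arguments as their read counterparts (which may not match the patch exactly):

#define pq_write_conn(conn, ptr, len) \
    ((conn)->zstream \
     ? zpq_write((conn)->zstream, (ptr), (len)) \
     : pqsecure_write((conn), (ptr), (len)))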

In configure.ac:
> if test "$with_zstd" = yes; then
> AC_CHECK_LIB(zstd, ZSTD_decompressStream, [],
> [AC_MSG_ERROR([zstd library not found
> If you have zstd already installed, see config.log for details on the
> failure. It is possible the compiler isn't looking in the proper directory.
> Use --without-zstd to disable zstd support.])])
> fi

> if test "$with_zstd" = yes; then
> AC_CHECK_HEADER(zstd.h, [], [AC_MSG_ERROR([zstd header not found
> If you have zstd already installed, see config.log for details on the
> failure. It is possible the compiler isn't looking in the proper directory.
> Use --without-zstd to disable zstd support.])])
> fi

It looks like the lines mentioning --without-zstd are incorrect.

In fe-connect.c:
> if (index == (char)-1)
> {
> appendPQExpBuffer(&conn->errorMessage,
> libpq_gettext(
> "server is not supported requested compression algorithms %s\n"),
> conn->compression);
> goto error_return;
> }
Right now this error might be displayed in two cases:
1. The backend supports compression, but it is somehow disabled/turned off.
2. The backend supports compression, but does not support the requested algorithms.

I think that it is a good idea to differentiate these two cases. Maybe define the following behavior somewhere in the docs:

“When connecting to an older backend that does not support compression, or when the backend supports compression but for some reason wants to disable it, the backend will simply ignore the _pq_.compression parameter and won’t send the compressionAck message to the frontend.”
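With that behavior, the frontend-side logic would boil down to something like the rough sketch below; compression_ack_received is a hypothetical flag standing in for however the patch tracks the compressionAck message, and the message text is only a suggested rewording:

/* After processing the server's startup response */
if (!compression_ack_received)
{
    /* Server ignored _pq_.compression: it is either too old or has
     * compression disabled, so silently continue uncompressed. */
    conn->zstream = NULL;
}
else if (index == (char) -1)
{
    /* Server acknowledged compression but none of the requested
     * algorithms matched, so report an error. */
    appendPQExpBuffer(&conn->errorMessage,
                      libpq_gettext("server does not support any of the requested compression algorithms: %s\n"),
                      conn->compression);
    goto error_return;
}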

To sum up, I think that the current implementation already brings good benefits. As I proposed in the Usability review, we can introduce the new approaches later as separate compression 'algorithms'.

Thanks,
Daniil Zakhlystov
