Re: Re: A separate table level option to control compression

From: Shaun Thomas <shaun(dot)thomas(at)2ndquadrant(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>
Subject: Re: Re: A separate table level option to control compression
Date: 2019-03-20 21:40:48
Message-ID: CAG1YDPf6GibkgEamaZyKRT_QEX+9isnt_9zNXbr3uW8gVmBDyw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jumping in here, please be gentle. :)

Contents & Purpose
==================

This appears to be a patch to add a new table storage option similar to
`toast_tuple_target` but geared toward compression. As a result, it's been
named `compress_tuple_target`, and allows modifying the threshold where
inline tuples actually become compressed. If we're going to make the toast
threshold configurable, it tends to make sense we'd do the same for the
compression threshold.

The patch includes necessary documentation to describe the storage parameter
along with limitations and fallback operating modes. Several tests are also
included.

Verification Procedure
======================

The patch applied clean to HEAD, which was at commit 28988a84c... by Peter
Eisentraut, at the time of this review.

Build succeeded without issue, as did `make check` and `make installcheck`.
In addition, I also performed the following manual verification steps
using table page count and `pgstattuple` page distribution for a 10-row
table with a junk column in these scenarios:

* A standard table with a 1000-byte junk column not using this option:
2 pages at 66% density
* A table with a 1000-byte junk and `compress_tuple_target` set to 512:
1 page at 6% density; the low threshold activated compression
* A table with a 8120-byte junk and `compress_tuple_target` +
`toast_tuple_target` set to 8160. Further, junk column was set to
`main` storage to prevent standard toast thresholds from interfering:
10 pages at 99.5% density; no compression activated despite large column
* A table with a 8140-byte junk and `compress_tuple_target` +
`toast_tuple_target` set to 8180. Further, junk column was set to
`main` storage to prevent standard toast thresholds from interfering:
1 page at 16% density; compression threshold > 8160 ignored as documented.
Additionally, neither `compress_tuple_target` or `toast_tuple_target`
were saved in `pg_class`.

I also performed a `pg_dump` of a table using `compress_tuple_target`,
and dump faithfully preserved the option in the expected location before
the DATA portion.

Discussion
==========

Generally this ended about as I expected. I suspect much of the existing
code was cribbed from the implementation of `toast_tuple_target` given
the similar entrypoints and the already existing hard-coded compression
thresholds.

I can't really speak for the discussion related to `storage.sgml`, but
I based my investigation on the existing patch to `create_table.sgml`.
About the only thing I would suggest there is to possibly tweak the
wording.

* "The compress_tuple_target ... " for example should probably read
"The toast_tuple_target parameter ...".
* "the (blocksize - header)" can drop "the".
* "If the value is set to a value" redundant wording should be rephrased;
"If the specified value is greater than toast_tuple_target, then
we will substitute the current setting of toast_tuple_target instead."
would work.
* I'd recommend a short discussion on what negative consequences can be
expected by playing with this value. As an example in my tests, setting
it very high may result in extremely sparse pages that could have an
adverse impact on HOT updates.

Still, those are just minor nitpicks, and I don't expect that to affect the
quality of the patch implementation.

Good show, gents!

--
Shaun M Thomas - 2ndQuadrant
PostgreSQL Training, Services and Support
shaun(dot)thomas(at)2ndquadrant(dot)com | www.2ndQuadrant.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2019-03-20 21:44:30 Re: [survey] New "Stable" QueryId based on normalized query text
Previous Message legrand legrand 2019-03-20 21:30:33 Re: [survey] New "Stable" QueryId based on normalized query text