Re: Optimize external TOAST storage

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: davinder singh <davindersingh2692(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Optimize external TOAST storage
Date: 2022-03-15 21:48:13
Message-ID: 20220315214813.GB1087373@nathanxps13
Lists: pgsql-hackers

On Wed, Mar 16, 2022 at 01:03:38AM +0530, davinder singh wrote:
> Regarding the 2nd part of configuring the threshold: based on our
> experiments, we have fixed it for attributes with size > 2 * chunk_size.
> The default chunk_size is 2KB and the page size is 8KB. While toasting,
> each attribute is divided into chunks, and each page can hold a max of 4
> such chunks. We only need to think about the space used by the last
> chunk of the attribute. This means with each value optimization, it
> might use extra space in the range (0B, 2KB]. I think this extra space
> is independent of attribute size, so we don't need to worry about
> configuring this threshold. Let me know if I missed something here.

Looking closer, ISTM there are two thresholds here. The first threshold is
the minimum size of the uncompressed data required to trigger this
optimization. The purpose behind this threshold is to avoid increasing
disk space usage in return for very little gain in the read path (i.e.,
skipping decompression). Specifically, this threshold helps avoid
increasing disk space usage when there are many small TOAST attributes.
You've chosen 2x the TOAST chunk size as the threshold.
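
In code form, I imagine the first check looks something like this (my
sketch, not the patch's actual code; I'm using TOAST_MAX_CHUNK_SIZE
loosely to mean "the TOAST chunk size", and compress_as_usual() is an
invented helper):

    /* Sketch only: skip the optimization for small attributes so we
     * don't trade disk space for a negligible read-path gain. */
    if (VARSIZE_ANY_EXHDR(value) < 2 * TOAST_MAX_CHUNK_SIZE)
        compress_as_usual();    /* behave exactly as today */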

The second threshold is the amount of space that compression must save for
the compressed data to be stored at all. The purpose behind this threshold
is to avoid storing poorly compressed data so that we can improve
performance in the read path (i.e., skipping decompression). You've chosen
a savings of >= 1 TOAST chunk as the threshold.
This means that when I have an attribute with size 3x a TOAST chunk, I'll
only store it compressed when the compressed attribute is 2x a TOAST chunk
or smaller (a 33% improvement). If I have an attribute with size 100x a
TOAST chunk, I'll store it compressed when the compressed attribute is 99x
a TOAST chunk or smaller (a 1% improvement).
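
In other words, the decision seems to reduce to something like the
following (again my sketch of the logic as I understand it; the two
store_* helpers are invented names):

    /* Keep the compressed copy only if it saves at least one full
     * TOAST chunk versus the uncompressed copy. */
    if (VARSIZE_ANY_EXHDR(compressed_value) <=
        VARSIZE_ANY_EXHDR(orig_value) - TOAST_MAX_CHUNK_SIZE)
        store_compressed();     /* e.g., 3 chunks -> 2 chunks (33%) */
    else
        store_uncompressed();   /* poor ratio; optimize for reads */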

IIUC this means that it is much more likely that you will store a
compressed attribute when the uncompressed data is larger. The benefit of
this approach is that you know you'll never give up more than one TOAST
chunk of disk space in the name of faster reads. The downside is that a
huge number of smaller attributes can require much more disk space than if
they were universally compressed.
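
For a concrete worst case (my numbers, not from your tests): a million
attributes that are each just over 2 TOAST chunks uncompressed and would
each compress to just over 1 chunk all get stored uncompressed, wasting
nearly a full chunk (~2KB) apiece, or on the order of 2GB across the
table.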

Perhaps this is missing one additional threshold: a compression ratio at or
above which the optimization is skipped and the data is stored compressed
anyway. If my
uncompressed data is 3x a TOAST chunk and compressing it would save half of
a TOAST chunk of disk space (a ~17% improvement), maybe we should still
store it compressed. I think this threshold would be most important for
servers with many small TOAST attributes. I don't know whether it is
necessary to make all of this configurable or whether we should just use
some heuristics to find a good balance. In any case, I imagine the
settings should look different for the different TOAST compression methods.
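
Such a threshold might slot in as a final check, something like this
(purely hypothetical; min_compression_ratio is an invented knob, e.g.
0.15 to keep anything that shrinks by at least 15%):

    /* Even if compression saves less than one full chunk, keep the
     * compressed copy when the ratio alone is good enough. */
    double  saved_fraction =
        (double) (orig_size - compressed_size) / orig_size;

    if (saved_fraction >= min_compression_ratio)
        store_compressed();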

Or maybe we should just make the first threshold the configuration option
for this feature. If you set it to -1, the optimization is never used. If
you set it to 0, you always try to save on read time at the expense of disk
space. If you set it to 2x the TOAST chunk size, you might use up to 50%
more disk space for read path optimizations. If you set it to 100x the
TOAST chunk size, you might use up to 1% more disk space for read
optimizations. By default, this configuration option could be set
conservatively (e.g., 10x a TOAST chunk, which could lead to a maximum of
10% more disk space). I suspect the increase in disk space would be much
less than these numbers in most cases, but it might be helpful to think of
these things in terms of the worst case scenario.
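
Roughly, with an invented GUC name for illustration
(toast_external_uncompressed_threshold; not an existing setting):

    /*
     *  -1       -> optimization disabled, always compress as today
     *   0       -> always consider storing uncompressed
     *   N bytes -> consider it only when the uncompressed size >= N
     */
    if (toast_external_uncompressed_threshold >= 0 &&
        orig_size >= toast_external_uncompressed_threshold)
        consider_storing_uncompressed();    /* hypothetical helper */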

I apologize for thinking out loud a bit here, but I hope this gives you
some insight into my perspective on this. In general, I am skeptical that
we can choose one threshold that will work for all PostgreSQL installations
in the known universe.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
