Re: Optimize external TOAST storage

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: davinder singh <davindersingh2692(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Optimize external TOAST storage
Date: 2022-03-22 21:42:53
Message-ID: 20220322214253.GA1601968@nathanxps13
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 22, 2022 at 04:34:05PM -0400, Robert Haas wrote:
> We seem to have a shortage of "others" showing up with opinions on
> this topic, but I guess I'm not very confident about the general
> utility of such a setting. Just to be clear, I'm also not very
> confident about the usefulness of the existing settings for
> controlling TOAST. Why is it useful default behavior to try to get
> rows down to 2kB by default, rather than 1.8kB or 3kB? Even more, why
> don't we try to compress primarily based on the length of individual
> attributes and then compress further only if the resulting tuple
> doesn't fit into a page at all? There doesn't seem to be anything
> magic about fitting tuples into a quarter-page, yet the code acts as
> though that's the primary goal - and then, when that didn't turn out
> to work well in all cases, we added a user-settable parameter
> (toast_tuple_target) to let you say you really want tuples in table X
> to fit into a third of a page or a fifth of a page instead of a
> quarter. And there's some justification for that: a proposal to
> fundamentally change the algorithm would likely have gotten bogged
> down for years, and the parameter no doubt lets you solve some
> problems. Yet if the whole algorithm is wrong, and I think maybe it
> is, then being able to change the constants is not really getting us
> where we need to be.
>
> Then, too, I'm not very confident about the usefulness of EXTENDED,
> EXTERNAL, and MAIN. I think it's useful to be able to categorically
> disable compression (as EXTERNAL does), but you can't categorically
> disable out-of-line storage because the value can be bigger than the
> page, so MAIN vs. EXTENDED is just changing the threshold for the use
> of out-of-line storage. However, it does so in a way that IMHO is not
> particularly intuitive, which goes back to my earlier point about the
> algorithm seeming wacky, and it's in any case unclear why we should
> offer exactly two possible thresholds and not any of the intermediate
> values.

I agree with all of this. Adding configurability for each constant might
help folks in the short term, but using these knobs probably requires quite
a bit of expertise in Postgres internals as well as a good understanding of
your data. I think we ought to revist TOAST configurability from a user
perspective. IOW what can be chosen automatically, and how do we enable
users to effectively configure the settings that cannot be chosen
automatically? IMO this is a worthwhile conversation to have as long as it
doesn't stray too far into the "let's rewrite TOAST" realm. I think there
is always going to be some level of complexity with stuff like TOAST, but
there are probably all sorts of ways to simplify/improve it also.

> Maybe the conclusion here is that more thought is needed before
> changing anything in this area....

You've certainly got me thinking more about this. If the scope of this
work is going to expand, a few months before the first v16 commitfest is
probably the right time for it.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message samay sharma 2022-03-22 21:48:08 Re: [PoC] Federated Authn/z with OAUTHBEARER
Previous Message Andrew Dunstan 2022-03-22 21:37:50 Re: SQL/JSON: JSON_TABLE