Re: documentation on HOT

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, Pg Docs <pgsql-docs(at)lists(dot)postgresql(dot)org>
Subject: Re: documentation on HOT
Date: 2022-07-22 18:08:24
Message-ID: YtrnmPQR4wYR17YE@momjian.us
Lists: pgsql-docs

On Fri, Jul 22, 2022 at 09:25:43AM -0700, David G. Johnston wrote:
> On Fri, Jul 22, 2022 at 8:09 AM Jonathan S. Katz <jkatz(at)postgresql(dot)org> wrote:
> I think we need to expose the information regarding columns used in predicates
> here.
>
> "(Here, "indexed column" means any column referenced
> at all in an index definition, including for example columns that are
> tested in a partial-index predicate but are not stored in the index.)"

Okay, I clarified this in the attached patch.

> I get it is an implementation detail but explaining the name seems like a good
> thing to do as well:
>
> "Without HOT, every version of a row in an update chain has its own index
> entries, even if all indexed columns are the same.  With HOT, a new tuple
> placed on the same page and with all indexed columns the same as its
> parent row version does not get new index entries.  This means there is
> only one index entry for the entire update chain on the heap page.
> An index-entry-less tuple is marked with the HEAP_ONLY_TUPLE flag."

I don't see how the chain is useful for people trying to understand how
to benefit from this feature.

> Where the last sentence becomes: "Those index-entry-less tuples (yeah, still
> dislike triple-hyphenation...) are thus named 'Heap-Only Tuples'."
>
> (I've actually incorporated this as I think it should be down below, as a
> lead-in to the listing of conditions for when the optimization can be used.)
>
> Then maybe "can be removed during select" should be reworded as:
>
> "No longer visible heap-only tuples can be removed during normal
> operation, including <command>SELECT</command>s, instead of requiring
> periodic vacuum operations."

I added a no-longer-visible qualifier to the patch.

> The original heap entry the index points to cannot be removed. "Old versions of
> heap-only tuples" vs. "No longer visible heap-only tuples" is probably a style
> choice.  There are basically three different "versions" in context here though
> so avoiding "old versions" has some appeal to me.
>
> I'm not a fan of:
>
> "Fortunately, there is an automatic system..."
>
> I'd like to give credit to the fact we engineered a solution to the downsides,
> so change the lead-in paragraph to the conditions listing to be:

Yeah, good point. We didn't stumble upon this feature. I have adjusted
that wording.

> "To mitigate these downsides PostgreSQL implements an optimization whereby
> sometimes only the heap tuple is created, not the index entry, when performing
> an update.  In a case of giving things obvious and meaningful names, this is
> the Heap-Only Tuple (HOT) Optimization.  This optimization is possible when:"

Sorry, I don't like the above since it isn't precise and the "In a case
of giving things obvious and meaningful names" seems odd.

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

Indecision is a decision. Inaction is an action. Mark Batterson

Attachment: hot.diff (text/x-diff, 7.3 KB)
