Re: Expanding HOT updates for expression and partial indexes

From: Greg Burd <greg(at)burd(dot)me>
To: pgsql-hackers(at)postgresql(dot)org <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Expanding HOT updates for expression and partial indexes
Date: 2025-11-19 18:21:51
Message-ID: 092E6AFE-81DA-4869-91C7-8F9F5A7541E5@greg.burd.me
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Nov 19 2025, at 1:00 pm, Greg Burd <greg(at)burd(dot)me> wrote:

>
> On Nov 16 2025, at 1:53 pm, Greg Burd <greg(at)burd(dot)me> wrote:
>
>> 0004 - Enable HOT updates for expression and partial indexes
>>
>> This finally gets us back to where this project started, but on much
>> more firm ground than before because we're not going to
>> self-deadlock.
>> The idea has grown from a small function into something larger, but only
>> out of necessity.
>>
>> In this patch I add ExecWhichIndexesRequireUpdates() in execIndexing.c
>> which implements (c) finding the set of attributes that force new index
>> updates. This set can be very different from the modified indexed
>> attributes. We know that some attributes are not equal to their
>> previous versions, but does that mean that the index that references
>> that attribute needs a new index tuple? It may, or it may not. Here's
>> the comment on that function that explains:
>>
>> /*
>> * ExecWhichIndexesRequireUpdates
>> *
>> * Determine which indexes need updating given modified indexed attributes.
>> * This function is a companion to ExecCheckIndexedAttrsForChanges().
>> On the
>> * surface, they appear similar but they are doing two very different things.
>> *
>> * For a standard index on a set of attributes this is the
>> intersection of
>> * the mix_attrs and the index attrs (key, expression, but not predicate).
>> *
>> * For expression indexes and indexes which implement the amcomparedatums()
>> * index AM API we'll need to form index datum and compare each
>> attribute to
>> * see if any actually changed.
>> *
>> * For expression indexes the result of the expression might not change
>> at all,
>> * this is common with JSONB columns which require expression indexes
>> and where
>> * it is commonplace to index a field within a document and have
>> updates that
>> * generally don't update that field.
>> *
>> * Partial indexes won't trigger index tuples when the old/new tuples
>> are both
>> * outside of the predicate range.
>> *
>> * For nbtree the amcomparedatums() API is critical as it requires
>> that key
>> * attributes are equal when they memcmp(), which might not be the
>> case when
>> * using type-specific comparison or factoring in collation which
>> might make
>> * an index case insensitive.
>> *
>> * All of this is to say that the goal is for the executor to know,
>> ahead of
>> * calling into the table AM for the update and before calling into
>> the index
>> * AM for inserting new index tuples, which attributes at a minimum will
>> * necessitate a new index tuple.
>> *
>> ...
>> */
>
> Attached are rebased (d5b4f3a6d4e) patches with the only changes
> happening in the last patch in the series.
>
> 0004 - Enable HOT updates for expression and partial indexes
>
> I was never happy with the dual functions
> ExecCheckIndexedAttrsForChanges() and ExecWhichIndexesRequireUpdates(),
> it felt like too much overhead and duplication of effort. While
> updating my tests, adding a few cases, I found that there was also a
> flaw in the logic. So, time to rewrite and combine them.
>
> What did I discover? Before the logic was to find the set of modified
> indexed attributes then review all the indexes for changed attributes
> using FormIndexDatum() and comparing before/after to see if expressions
> really changed the value to be indexed or not. The first pass didn't
> take into account expressions, the second did. So, an expression index
> over JSONB data wouldn't extract and test the field within the document,
> it was just comparing the entire document before/after using the jsonb
> comparison function, no bueno.
>
> This approach wraps both functions into one somewhat simplified
> function. The logic is basically, iterate over the indexes reviewing
> indexed attributes for changes. Along the way we call into the new
> index AM's comparison function when present, otherwise we find and use
> the proper type-specific comparison function for the datum. At the end
> of the function we have our Bitmapset of attributes that should trigger
> new index tuples.
>
>> What's left undone?
>>
>> * I need to check code coverage so that I might
>
> I did this and it was quite good, I'll do it again for this new series
> but it's nice to see that the tests are exercising the vast majority of
> the code paths.
>
>> * create tests covering all the new cases
>
> I think the coverage is good, maybe even redundant or overly complex
> in places.
>
>> * update the README.HOT documentation, wiki, etc.
>
> Soon, I hope to have this approach solid and under review before
> solidifying the docs.
>
>> * performance...
>
> Still as yet unmeasured, I know that there is more work per-update to
> perform these checks, so some overhead, but I don't know if that
> overhead is more than before with HeapDetermineColumnsInfo() and
> index_unchanged_by_update(). Those two functions did essentially the
> same thing, only with binary comparison (datumIsEqual()). I need to
> measure that. What about doing all this work outside of the buffer lock
> in heap_update()? Surely that'll give back a bit or at least add to
> concurrency. Forming index tuples a few extra times and evaluating the
> expressions 3 times rather than 1 is going to hurt, I think I can come
> up with a way to cache the formed datum and use it later on, but is that
> worth it? Complex expressions, yes. Also, what about expressions that
> expect to be executed once... and now are 3x? That's what forced my
> update to the insert-conflict-specconflict.out test, but AFAICT there is
> no way to test if an expression's value is going to change on update
> without exercising it once for the old tuple and once for the new tuple.
> Even if it were possible for an index to provide the key it might have
> changed after the expression evaluation (as is the case in hash), so I
> don't think this is avoidable. Maybe that's reason enough to add a
> reloption to disable the expression evaluation piece of this? Given
> that it might create a logic or performance regression. The flip side
> is the potential to use the HOT path, that's a real savings.
>
> One concerning thing is that nbtree's assumption that key attributes for
> TIDs must use binary comparison for equality. This means that for our
> common case (heap/btree) there is more work per-update than before,
> which is why I need to measure. I could look into eliminating the
> nbtree requirement, I don't understand it too well as yet by I believe
> that on page split there is an attempt to deduplicate TIDs into a
> TIDBitmap and the test for when that's possible is datumIsEqual(). If
> that were the same as in this new code, possibly evening using
> tts_attr_equal(), then... I don't know, I'll have to investigate. Chime
> in here if you can educate me on this one. :)
>
> best.
>
> -greg

Doh!

I forgot to commit the fixed regression test expected output before
formatting the patch set, here it is.

-greg

Attachment Content-Type Size
v22-0001-Reorganize-heap-update-logic.patch application/octet-stream 47.6 KB
v22-0002-Track-changed-indexed-columns-in-the-executor-du.patch application/octet-stream 34.3 KB
v22-0003-Replace-index_unchanged_by_update-with-ri_Change.patch application/octet-stream 8.3 KB
v22-0004-Enable-HOT-updates-for-expression-and-partial-in.patch application/octet-stream 130.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Antonin Houska 2025-11-19 18:42:26 Avoid unnecessary processing of DISTINCT clause (Was: Unique Keys)
Previous Message Jacob Champion 2025-11-19 18:05:46 Re: PRI?64 vs Visual Studio (2022)