Re: Expanding HOT updates for expression and partial indexes

From: "Greg Burd" <greg(at)burd(dot)me>
To: "Jeff Davis" <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Expanding HOT updates for expression and partial indexes
Date: 2026-02-17 21:15:02
Message-ID: e1d06b18-2ebb-466e-bc0f-5d426d17a32e@app.fastmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Mon, Feb 16, 2026, at 2:36 PM, Jeff Davis wrote:
> On Sun, 2026-02-15 at 15:39 -0500, Greg Burd wrote:
>
>> (2) Types don't influence this decision today.  Their equality
>> operators are not used, attributes are memcmp().  This is a
>> requirement for index-only scans.  I think it's insufficient for
>> types like JSONB which have internal structure that can be extracted
>> to form index keys, but that's for a later commit.
>
> That's an interesting path but would require more infrastructure, and
> to justify going down that path we should look for other opportunities
> to use that type infra beyond just HOT. Brainstorming: maybe something
> in the planner can make use of intelligence around expressions that are
> effectively setter/accessor methods on complex types? (Obviously work
> for later.)

Agreed, but this is my best idea at the moment to re-introduce $subjet into this patch set. I'm open to other ideas, but fundamentally to allow $subjet requires that somewhere we discover that while portions of the JSONB attribute changed during the update the resulting index key attributes extracted from the JSONB did not.

In all patches v1-v29 that was discovered by evaluating the index expressions on the old and new tuples and comparing the index key datum that resulted for equality. When equal, while the attribute changed the indexes need not opening the door for a HOT update. The overhead of that wasn't horrible, but it does change some expectations about how often expressions are evaluated and that might be confusing.

>> I can see how this might be confusing, you're asking a good
>> question.  Why not just add the mix_attrs as an argument to the table
>> AM update call and be done?
>>
>> 1. Bitmapsets that are NULL mean empty, so how would
>> simple_heap_update() signal to heap_update() that it needs to
>> determine the modified indexed attributes?  We'd have to add a bool
>> along with the mix_attrs Bitmapset to indicate: "we've not calculated
>> the set yet, you need to do that."
>
> Maybe an extra bool is not ideal, but it's better than moving code onto
> the wrong side of an API boundary.

I'm not sure this is an "API boundary", but I'm open to debate on that. The attached patch does add a bool to heap_update() in addition to the Bitmapset of modified/indexed attributes (modified_attrs) and IMO is fine although I don't like that it further complicates (muddies?) an already huge and highly complicated function, heap_update().

>> 2. After fetching the exclusive buffer lock there is the test
>> `!ItemIdIsNormal(lp)` to cover the case where a simple_heap_update()
>> the otid origin is the syscache, there is no pin or snapshot, and so
>> there might be LP_* states other than LP_NORMAL due to concurrent
>> pruning.  This only happens when updating catalog tuples, so this
>> logic need not be present at all in the heapam_tuple_update().  Yes,
>> the if() branch will be fast (frequently predicted by the CPU) but
>> this feels like logic specific to the update of catalog tuples.
>
> If convenient I'm fine with moving that branch out, but I think it
> needs to be done with the buffer locked (right?), so heap_update()
> looks like the right place for that test for now.
>
>> 3. HeapDetermineColumnsInfo() actually does more than find the
>> modified indexed attributes, it also performs half of the check for
>> the requirement to WAL log the replica identity attributes.  The
>> replacement function in the executor doesn't do this work, so that is
>> coded into heapam_tuple_update() but not simple_heap_update().  The
>> second half is in ExtractReplicaIdentity() that happens later in the
>> heap_update() function after determining if HOT is a possibility or
>> not.
>
> IIUC you are saying that the decision is too heap-specific to expose to
> the executor. I think that's true today (as ExtractReplicaIdentity() is
> in heapam.c), but perhaps that's not fundamental: TOAST is not heap-
> specific, replica IDs are not heap-specific, and if you are WAL-logging
> a replica identity key it seems like you need to know whether it's
> external or not regardless of the AM.
>
> I'm not asking for a change here, just trying to understand the API
> boundaries.

I'm on the fence on this one, maybe all table AMs will need this same logic but at the moment it feels very heap-specific to me. For now it lives in heap_update() and HeapDetermineColumnsInfo(). I think I called this out only to say that if you look at the new vs old method for computing modified_attrs you'll see that's missing and done later in heap_update().

>> I have moved these changes back into heap_update(), add the mix_attrs
>> and mix_attrs_valid to see how things look, that's the attached
>> patch.
>
> Thank you -- that's easier to understand.

Excellent.

I've muddied the review again a bit by abstracting out a function ExecCompareSlots() which can loosely be thought of as a replacement for heap_attr_equal(). Doing that let's me vastly simplify the replication worker changes and reuse it after ExecBRUpdateTriggers().

> Why does simple_heap_update() need to do the HeapDetermineColumnsInfo()
> inside heap_update()? It seems like you're trying to avoid doing the
> same work the executor is doing to determine the modified_attrs bitmap,
> but either (a) the work is cheap; or (b) the work to make the bitmap is
> expensive.

simple_heap_update() is exclusively called during catalog tuple updates and does not involve the executor at all, these are direct calls into heap to store catalog tuples. These updates don't preserve the set of updated attributes (see cf-6221 [0]) and so for them to potentially use the HOT path in heap_update() we need to identify which attributes changed. This requires comparing the new heap tuple vs the one read from the buffer page. This is the same logic as before the patch. If I could come up with a simple method to retain the set of modified attributes during catalog tuple updates then we could excise HeapDetermineColumnsInfo() entirely.

> If (a), then just construct the correct bitmap in simple_heap_update()
> and simplify the code. If (b), then optimizing the simple_heap_update()
> case isn't good enough, we need to find ways of avoiding the work in
> the most common cases in the executor, as well.

Construct the correct bitmap how? That function is called with the otid and the updated HeapTuple, not enough to build the bitmap of modified indexed attributes.

I said in an earlier email that the results for the modified/indexed bitmap and the replica identity are identical. I added some test code (messy, but attached for your amusement should you want) that looks for differences between the two. It turns out that for replica identity the results are identical, for modified/indexed attributes there are a few differences.

In brin.sql at 409:
UPDATE brintest SET int8col = int8col * int4col;

This updates the indexed value of the int8col (attribute 4) which is correctly detected in the new code, but isn't in HeapDetermineColumnsInfo(). The "interesting_cols" are identical. I'm still investigating.

I'm also working on a good benchmark that hopefully can show that under heavy concurrent read/update mixed load the reduced exclusive lock time will allow more work to be performed. Some contrived benchmarks have shown 15% improvement, but I need to zero in on this before publishing results.

> Regards,
> Jeff Davis

Attached is a new version of the patch. The file heap-check.c contains the code I had been using to test for equal results across methods. That code is... well, it's just for me but I include it to show that I'm doing due diligence to ensure this stuff is either the same, or different for a good reason. Thanks for your continued interest in this work.

best.

-greg

[0] https://commitfest.postgresql.org/patch/6221/

Attachment Content-Type Size
v20260217-0001-Idenfity-modified-indexed-attributes-in-th.patch text/x-patch 31.6 KB
check-heapam.c text/x-csrc 5.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2026-02-17 21:18:00 Use standard die() handler for SIGTERM in bgworkers
Previous Message Nathan Bossart 2026-02-17 21:00:48 Re: add warning upon successful md5 password auth