Re: Expanding HOT updates for expression and partial indexes

From: "Greg Burd" <greg(at)burd(dot)me>
To: "Jeff Davis" <pgsql(at)j-davis(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Expanding HOT updates for expression and partial indexes
Date: 2026-02-15 20:39:42
Message-ID: da3400d5-e5e0-4d24-bd49-822c72c9894f@app.fastmail.com
Lists: pgsql-hackers


On Sat, Feb 14, 2026, at 2:39 PM, Jeff Davis wrote:
> On Fri, 2026-02-13 at 16:06 -0500, Greg Burd wrote:
>> Here's my thinking, this patch set can be thought of as:
>>
>> a) moving HeapDetermineColumnsInfo() into the executor

Hey Jeff, thanks for taking a look at this! :)

> This feels like the core of the series: moving the logic into the
> executor make it possible to be smarter about whether HOT can be
> applied or not.

Yes, but put a different way, this change moves what is not heap-specific out of the heap and leaves the heap-isms inside it.

> I think that is a good direction to go. I don't think HOT is
> fundamentally a heap concept because other AMs may do something similar
> and would want similar information.

The ability for a table AM to skip writing new index entries for a subset of updates is the non-heap-specific feature I'm making more general in this patch. Heap calls this "HOT" (heap-only tuple), but while the name is a heap-ism, the concept isn't. Other table AM implementations might behave differently: they might implement MVCC differently, have UNDO logs, or even be index-organized tables (IoT). What's baked into the code today is very tightly aligned with how heap works. The first thing I'd like to adjust is the identification of the "modified indexed attributes" on update. The second will be the TU_UpdateIndexes enum.

> There are two parts to the decision
> of whether to use HOT: first, is there a logical change to the indexed
> value; and second, does the AM need to break the change for some other
> reason (e.g. the current page is full).

I view this as "agreeing on which indexes require index writes during an update," and I think there are three parts of the system that work together to determine this:

1. indexes
2. types
3. tables

(1) Indexes influence this decision when they are summarizing. Otherwise it is assumed they only need updates when the table AM signals TU_All. I think indexes need more influence over this, but that's for a later commit.

(2) Types don't influence this decision today. Their equality operators are not used; attributes are compared bytewise with memcmp(). This is a requirement for index-only scans. I think it's insufficient for types like JSONB, which have internal structure that can be extracted to form index keys, but that's for a later commit.
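
To make the bytewise-versus-type-equality distinction concrete, here is a minimal standalone sketch (not PostgreSQL code; both function names are made up for illustration). It uses a plain C double, where 0.0 and -0.0 compare equal but have different bit patterns under IEEE 754:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Bytewise comparison, in the spirit of a memcmp()-style check. */
bool
sketch_bytes_equal(const double *a, const double *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, sizeof(double)) == 0;
}

/* Type-level equality, standing in for a type's "=" operator. */
bool
sketch_type_equal(const double *a, const double *b)
{
    assert(a != NULL && b != NULL);
    return *a == *b;
}
```

For 0.0 and -0.0 the type-level check reports "equal" while the bytewise check reports "different". An index-only scan returns whatever bytes the index stores, which is why the stricter bytewise check is the safe one for that case.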

(3) Tables, really only heap today, own this decision in our code as it stands now. The heap informs the executor: TU_All, TU_None, TU_Summarizing (all, none, some). The choice between these three outcomes is the "HOT decision" in the heap_update() code. All indexes are updated when any indexed attribute is modified. No indexes are updated when there are no modifications to indexed attributes. If only attributes referenced by summarizing indexes were modified then all summarizing indexes are updated (even those without modified attributes). Of course if the newly updated tuple (even after being compressed/TOASTed) won't fit on the same page as the old one then the result is TU_All. This is where HEAD's logic is today.
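
As a rough sketch of the three-way outcome described above (this is not the heap_update() code; the enum, the function names, and the use of a 64-bit mask in place of a Bitmapset are all assumptions made for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for TU_UpdateIndexes. */
typedef enum
{
    SK_TU_NONE,
    SK_TU_ALL,
    SK_TU_SUMMARIZING
} SketchUpdateIndexes;

/* Bit for attribute number attno in a 64-bit attribute mask. */
uint64_t
sketch_attr_bit(int attno)
{
    assert(attno >= 0 && attno < 64);
    return UINT64_C(1) << attno;
}

/*
 * modified:          attributes changed by the update
 * nonsumm_attrs:     attributes referenced by any non-summarizing index
 * summarizing_attrs: attributes referenced by summarizing indexes
 * fits_on_page:      new tuple fits on the same page as the old one
 */
SketchUpdateIndexes
sketch_hot_decision(uint64_t modified, uint64_t nonsumm_attrs,
                    uint64_t summarizing_attrs, bool fits_on_page)
{
    if (!fits_on_page)
        return SK_TU_ALL;           /* cannot stay on the page */
    if (modified & nonsumm_attrs)
        return SK_TU_ALL;           /* an ordinary indexed attribute changed */
    if (modified & summarizing_attrs)
        return SK_TU_SUMMARIZING;   /* only summarizing indexes are affected */
    return SK_TU_NONE;              /* no indexed attribute changed */
}
```

Note that the "some" outcome is coarse: it names a class of indexes rather than the specific ones whose attributes changed.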

Future AMs, or even heap, might be able to:
- chain updates across pages
- serve as the primary key index as well as tuple storage (IoT)

Indexes might be able to:
- identify that they only index portions of the key datum provided
- ask types if that portion of the data changed or not

Types might be able to:
- record how they've been indexed on a relation
- record during mutation on update if the changes intersect with what's provided to indexes

But none of that is in this patch now. The only change that might be useful is to avoid updating unmodified summarized indexes on a HOT/summarized update. That's a simple addition on top of this patch, but not in the attached patch.

> It seems reasonable to me that the executor is the right place for the
> first check, because it can be more precise. The motivating example
> here is a JSON document where one field is indexed and unrelated fields
> are being updated, in which case we can still do a HOT update because
> the indexed value isn't actually changing.

Yes, JSON is a good example of how a type plays a role in deciding which indexes need updates. That's where this thread started: expression indexes on JSONB whose indexed values are unchanged by an update should still allow HOT updates (IMO).
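
A toy illustration of that motivating case, not anything in the attached patch: the struct below is a hypothetical stand-in for a JSONB document with two fields, and sketch_index_expr() stands in for the index expression that extracts one of them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a JSONB document with two fields. */
typedef struct SketchDoc
{
    const char *name;   /* the field the expression index extracts */
    int         views;  /* an unrelated, frequently updated field */
} SketchDoc;

/* Stand-in for the index expression (extract one field). */
const char *
sketch_index_expr(const SketchDoc *doc)
{
    assert(doc != NULL);
    return doc->name;
}

/* Whole-column check: any field change counts as a modification. */
bool
sketch_column_changed(const SketchDoc *oldd, const SketchDoc *newd)
{
    return strcmp(oldd->name, newd->name) != 0 || oldd->views != newd->views;
}

/* Expression-level check: only the extracted index key matters. */
bool
sketch_index_key_changed(const SketchDoc *oldd, const SketchDoc *newd)
{
    return strcmp(sketch_index_expr(oldd), sketch_index_expr(newd)) != 0;
}
```

Bumping only views changes the column but not the index key, which is the case where a HOT update would still be safe for that index.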

> As far as the patch itself, it seems like you're moving a lot of code
> out of heap_update() and into simple_heap_update() &
> heapam_tuple_update(). Can you explain why that's needed? Perhaps I
> just need to look closer.

I can see how this might be confusing, you're asking a good question. Why not just add the mix_attrs as an argument to the table AM update call and be done?

1. Bitmapsets that are NULL mean empty, so how would simple_heap_update() signal to heap_update() that it needs to determine the modified indexed attributes? We'd have to add a bool along with the mix_attrs Bitmapset to indicate: "we've not calculated the set yet, you need to do that."

2. After fetching the exclusive buffer lock there is the test `!ItemIdIsNormal(lp)` to cover the case where, for a simple_heap_update(), the otid originates from the syscache: there is no pin or snapshot, and so there might be LP_* states other than LP_NORMAL due to concurrent pruning. This only happens when updating catalog tuples, so this logic need not be present at all in heapam_tuple_update(). Yes, the if() branch will be fast (frequently predicted by the CPU), but this feels like logic specific to the update of catalog tuples.

3. HeapDetermineColumnsInfo() actually does more than find the modified indexed attributes; it also performs half of the check for the requirement to WAL log the replica identity attributes. The replacement function in the executor doesn't do this work, so that is coded into heapam_tuple_update() but not simple_heap_update(). The second half is in ExtractReplicaIdentity(), which is called later in heap_update() after determining whether HOT is a possibility or not.

I have moved these changes back into heap_update() and added mix_attrs and mix_attrs_valid to see how things look; that's the attached patch.
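
The NULL-means-empty problem from point 1, and how a validity flag resolves it, can be sketched roughly like this. This is not the patch's code: the function names and signatures are invented, a pointer to a 64-bit mask stands in for a Bitmapset pointer, and plain int arrays stand in for tuples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the callee computing the modified indexed attributes itself. */
uint64_t
sketch_determine_modified(const int *oldvals, const int *newvals,
                          int natts, uint64_t indexed)
{
    uint64_t modified = 0;

    assert(natts >= 0 && natts <= 64);
    for (int i = 0; i < natts; i++)
    {
        if (oldvals[i] != newvals[i])
            modified |= UINT64_C(1) << i;
    }
    return modified & indexed;
}

/*
 * Returns the set of modified indexed attributes the update will act on.
 * A NULL mix_attrs is a legitimate "empty set", so it cannot also mean
 * "not computed yet"; mix_attrs_valid carries that second meaning.
 */
uint64_t
sketch_update(const uint64_t *mix_attrs, bool mix_attrs_valid,
              const int *oldvals, const int *newvals,
              int natts, uint64_t indexed)
{
    if (mix_attrs_valid)
        return mix_attrs ? *mix_attrs : 0;  /* caller did the work */
    return sketch_determine_modified(oldvals, newvals, natts, indexed);
}
```

An executor-style caller would pass its precomputed set with the flag set to true, while a simple_heap_update()-style caller would pass NULL and false and let the callee compute the set.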

>> b) all that HOT nonsense
>>
>> I feel that (a) has value even without (b), that removing a chunk of
>> work from within an exclusive buffer lock to outside that lock is a
>> Good Thing(TM) and could in this case result in more concurrency.
>
> Right. It would be nice to see some numbers here, but moving work
> outside of the buffer lock seems like a good idea.

No numbers yet.

>> To that end, present to you a single patch that *only* does (a), it
>> moves the logic of HeapDetermineColumnsInfo() into the executor and
>> doesn't change anything else.  Meaning that what goes HOT today
>> (without this patch), should continue to be HOT tomorrow (with this
>> patch) and nothing else.
>
> Why are there test diffs?

Removed test differences.

> Also, does this move us (slightly) closer to PHOT?

I'd argue that it does move the code closer to something WARM/PHOT-like, meaning an ability to update only a subset of indexes on update: a change from "all/none/some" to "the required subset, and no more." That's a ways off.
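
For what "the required subset, and no more" could mean mechanically, here is a minimal sketch. Nothing like it is in the attached patch; the function name is made up, and 64-bit masks again stand in for Bitmapsets. It also ignores everything an AM would need in order to make such partial index updates safe.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns a mask with bit i set when index i references at least one
 * modified attribute and therefore needs a new entry.
 */
uint32_t
sketch_indexes_to_update(uint64_t modified, const uint64_t *index_attrs,
                         int nindexes)
{
    uint32_t result = 0;

    assert(nindexes >= 0 && nindexes <= 32);
    for (int i = 0; i < nindexes; i++)
    {
        if (index_attrs[i] & modified)
            result |= UINT32_C(1) << i;
    }
    return result;
}
```

With three indexes on attribute sets {0}, {1,2}, and {3}, modifying only attribute 1 selects just the second index rather than all of them.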

> Regards,
> Jeff Davis

Thanks again for looking, Jeff,

-greg

Attachment Content-Type Size
v20260215-0001-Idenfity-modified-indexed-attributes-in-th.patch text/x-patch 32.7 KB
