Re: Expanding HOT updates for expression and partial indexes

From: "Greg Burd" <greg(at)burd(dot)me>
To: "Burd, Greg" <gregburd(at)amazon(dot)com>, "Matthias van de Meent" <boekewurm+postgres(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Expanding HOT updates for expression and partial indexes
Date: 2025-07-02 18:10:19
Message-ID: 97f0aa72-f172-4673-8b04-533f022c3149@app.fastmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I'm working again on expanding the conditions under which HOT updates are allowable, it is still a work-in-progress at this point. It has been a while since my last update to this patch set so I'll refresh everyone's memories (including myself) with an overview. Apologies in advance for a long email...

The goal is to allow HOT updates under two new conditions:
* when an indexed expression has not changed
* when possible for a partial index

Expression Indexes and HOT updates
==========================================================
Indexes on expressions such as:
CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1));
CREATE INDEX people_names ON people ((first_name || ' ' || last_name));
CREATE INDEX names ON people((docs->>'user'));

These are not currently candidates for HOT updates because the attributes they reference are added to the hotblockingattrs bitmapset that is compared for overlap with the modified_attrs bitmap in heap_update() as created by HeapDetermineColumnsInfo(). Logically this exclusion makes sense because expressions used to form indexes can only contain references to functions that are IMMUTABLE. This then allows for a simple and efficient method for determining if the update is a candidate for the HOT path, a quick check for overlap of two bitmapsets and you have an answer.

But there is a common case that is overlooked. The third example above is an expression index that references a JSONB attribute 'docs' field 'name'. Imagine a simple bit of JSON stored in the 'docs' JSONB column changed by an UPDATE from:
{ "user": "scott", "password": "tiger" }
to:
{ "user": "scott", "password": "$ecret" }
this will not use the HOT path in today's code because the attribute for 'docs' will be referenced in the UPDATE statement and that attribute's content did change so it will be in the modified_attrs which will overlap with the hotblockingattrs set. The result is that it is not possible to use the HOT update path in heap_update() if there is an indexed JSONB column in that statement. This has a huge impact on performance and bloat. It's not hard to see that the index doesn't require an update, the portion of the document used to form the index tuple isn't changing.

This patch addressed this problem by evaluating the expression on the index using the current and new tuple and then comparing them. If there is no change to the index tuple then the HOT path is still an option for the update whereas before this was not the case.

There is at least one major challenge to this approach left to solve, it invokes UDFs while holding BUFFER_LOCK_EXCLUSIVE which we learn from heap_attr_equals() "we cannot safely invoke user-defined functions while holding exclusive buffer lock."

So, for performance an safety reasons it makes sense to try to limit the expressions that can be evaluated to those that can't self-deadlock trying to pin the same buffer and are roughly constant time and fast so we don't hold that lock for very long. Additionally it would be nice to ensure that we really have to do the evaluation in the first place.

I looked for ways to limit the types of expressions to evaluate so as to only take this approach when necessary, but I wasn't able to identify a good way forward. The problem is that it is possible to author a UDF similar to the json/jsonb getter functions json_extract_path() that extract a portion of a datum that is then used when creating an index. There is currently no demarcation on functions similar to "IMMUTABLE" (for example "EXTRACTOR") that would indicate that's what the UDF does nor does it strictly fit within the category of volatility as I don't think of this as information for the optimizer to use. Ideally you'd only want to evaluate the expression when any function in the expression is known to extract a portion of the datum. Were that the case you could require for HOT updates in the presence of expression indexes that at least one function in the expression be an "EXTRACTOR". I might still do this depending on reaction to it here, I'm seeking ideas.

Another idea was to change up the strategy a bit and not check expressions at all, but rather find a method to keep the attribute out of the modified_attrs set to begin with. This would mean that HeapDeterminColumnsInfo() would have to change, any maybe that's a good idea to try out too. It could even morph into something that can test equality using type-specific equality tests supporting custom types and invoke index access method-specific path extraction functions. In some ways this feels more "correct" to me in that HeapDeterminColumnsInfo()'s goal is to reduce the set of attributes in play from the interesting set to the modified set and if what is interesting is an index on a field within a document that isn't modified then it shouldn't be in the set. This comes with challenges related to that comment mentioned earlier in heap_attr_equals() and the fact that index access methods don't have a way to supply their equality op or provide a way to call a path operator should one exist. But the benefit would be that we expand HOT updates a bit further than my initial goals.

I'm not wedded to the use of a reloption to disable expression checks, nor do I love the name I choose ("expression_checks"). I'm interested to hear if I should simple remove this.

Another benefit to changing HeapDeterminColumnsInfo() is that it is what provides information to the PHOT patch during heap pruning to record which attributes were changed. As it stands now, this patch and the PHOT patch when combined result in some undesirable index scan results.

Partial Indexes and HOT updates
==========================================================

Partial indexes are a problem when it comes to HOT updates because they are part of the modified_attrs set returned by HeapDeterminColumnsInfo() even in cases where the value being set in the UPDATE does not satisfy the partial index predicate. Take for example:

CREATE TABLE example (a int, b int);
CREATE INDEX idx_a ON example(a);
CREATE INDEX idx_b ON example(b) WHERE b > 100;
INSERT INTO example (a, b) VALUES (1, 50);
UPDATE example SET b = 60 WHERE a = 1;

What happened? Was the update HOT? No. Why? Because the predicate isn't taken into account at the time the decision for HOT (or not) is made, that happens later during ExecInsertIndexTuples(). Today this update would trigger new index entries for both idx_a and idx_b when neither were required.

This patch changes this behavior by evaluating the predicate for the partial index earlier during heap_update(). When both the existing and updated tuples fall outside of the predicate the HOT update path is still an option. To limit scope creep I've not implemented the case when the existing was within the index's predicate but now the updated falls outside the predicate because that would require adding an optional delete function to the index access method (something that I think would be generally useful at some point). When that happens this patch does not allow a HOT update forcing the new tuple in the heap to not be redirected so later on index scan the CTID from the index will be ignored.

Summary
==========================================================

Attached find v17 of this patch rebased to this morning's (EDT) HEAD passing make world and formatted.

any and all feedback welcome.

-greg

Attachment Content-Type Size
v17-0001-Expand-HOT-update-path-to-include-expression-and.patch text/x-patch 134.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message John H 2025-07-02 18:21:54 Re: Making pg_rewind faster
Previous Message Tom Lane 2025-07-02 17:59:17 Re: Cross-type index comparison support in contrib/btree_gin