| From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
|---|---|
| To: | Greg Burd <greg(at)burd(dot)me>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Expanding HOT updates for expression and partial indexes |
| Date: | 2026-02-16 19:36:36 |
| Message-ID: | fc17cbe85c24ff9d3cf43bfb37e6ca0d483fb2b4.camel@j-davis.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Sun, 2026-02-15 at 15:39 -0500, Greg Burd wrote:
> (2) Types don't influence this decision today. Their equality
> operators are not used, attributes are memcmp(). This is a
> requirement for index-only scans. I think it's insufficient for
> types like JSONB which have internal structure that can be extracted
> to form index keys, but that's for a later commit.
That's an interesting path but would require more infrastructure, and
to justify going down that path we should look for other opportunities
to use that type infra beyond just HOT. Brainstorming: maybe something
in the planner can make use of intelligence around expressions that are
effectively setter/accessor methods on complex types? (Obviously work
for later.)
> I can see how this might be confusing, you're asking a good
> question. Why not just add the mix_attrs as an argument to the table
> AM update call and be done?
>
> 1. Bitmapsets that are NULL mean empty, so how would
> simple_heap_update() signal to heap_update() that it needs to
> determine the modified indexed attributes? We'd have to add a bool
> along with the mix_attrs Bitmapset to indicate: "we've not calculated
> the set yet, you need to do that."
Maybe an extra bool is not ideal, but it's better than moving code onto
the wrong side of an API boundary.
> 2. After fetching the exclusive buffer lock there is the test
> `!ItemIdIsNormal(lp)` to cover the case where a simple_heap_update()
> the otid origin is the syscache, there is no pin or snapshot, and so
> there might be LP_* states other than LP_NORMAL due to concurrent
> pruning. This only happens when updating catalog tuples, so this
> logic need not be present at all in the heapam_tuple_update(). Yes,
> the if() branch will be fast (frequently predicted by the CPU) but
> this feels like logic specific to the update of catalog tuples.
If convenient I'm fine with moving that branch out, but I think it
needs to be done with the buffer locked (right?), so heap_update()
looks like the right place for that test for now.
> 3. HeapDetermineColumnsInfo() actually does more than find the
> modified indexed attributes, it also performs half of the check for
> the requirement to WAL log the replica identity attributes. The
> replacement function in the executor doesn't do this work, so that is
> coded into heapam_tuple_update() but not simple_heap_update(). The
> second half is in ExtractReplicaIdentity() that happens later in the
> heap_update() function after determining if HOT is a possibility or
> not.
IIUC you are saying that the decision is too heap-specific to expose to
the executor. I think that's true today (as ExtractReplicaIdentity() is
in heapam.c), but perhaps that's not fundamental: TOAST is not heap-
specific, replica IDs are not heap-specific, and if you are WAL-logging
a replica identity key it seems like you need to know whether it's
external or not regardless of the AM.
I'm not asking for a change here, just trying to understand the API
boundaries.
> I have moved these changes back into heap_update(), add the mix_attrs
> and mix_attrs_valid to see how things look, that's the attached
> patch.
Thank you -- that's easier to understand.
Why does simple_heap_update() need to do the HeapDetermineColumnsInfo()
inside heap_update()? It seems like you're trying to avoid doing the
same work the executor is doing to determine the modified_attrs bitmap,
but either (a) the work is cheap; or (b) the work to make the bitmap is
expensive.
If (a), then just construct the correct bitmap in simple_heap_update()
and simplify the code. If (b), then optimizing the simple_heap_update()
case isn't good enough, we need to find ways of avoiding the work in
the most common cases in the executor, as well.
Regards,
Jeff Davis
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Zsolt Parragi | 2026-02-16 19:40:09 | Re: [OAuth2] Infrastructure for tracking token expiry time |
| Previous Message | Andres Freund | 2026-02-16 19:16:28 | Re: Inconsistency in installation of syscache_info.h |