| From: | "Greg Burd" <greg(at)burd(dot)me> |
|---|---|
| To: | "Jeff Davis" <pgsql(at)j-davis(dot)com> |
| Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Expanding HOT updates for expression and partial indexes |
| Date: | 2026-02-19 20:32:25 |
| Message-ID: | e5ce43c9-9a6d-4f8f-a01f-0ce1a99e3385@app.fastmail.com |
| Lists: | pgsql-hackers |
Hello,
This is an updated version of the last patch, with a few fixes and a layer on top of it that tries to clean up heap_update().
v20260219:
0001 - Not much changed in this patch: some cleanup and a few fixed mistakes. This patch passes all tests without any need to modify them. I've added an ExecCompareSlotAttrs() helper function.
0002 - As before in v29, I've split off the top half of heap_update() and moved that into heapam_tuple_update() and simple_heap_update(). I've created helper functions for different steps in these early stages: HeapUpdateHotAllowable(), HeapUpdateRequiresReplicaId(), HeapUpdateDetermineLockmode(). This allows for a cleaner set of bitmaps and logic (as I read it) on the heapam_tuple_update() path. I reuse these helper functions in simple_heap_update() when possible, even trimming down HeapDetermineColumnsInfo() a bit so that it can reuse HeapUpdateRequiresReplicaId().
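As a rough illustration of the kind of decisions those early-stage helpers make (this is not code from the patch), here is a minimal, self-contained sketch. It assumes a toy 64-bit attribute set in place of Bitmapset, and every name in it (attrset, toy_hot_allowable(), toy_requires_replica_id(), toy_determine_lockmode()) is hypothetical. It deliberately ignores details such as page free space, summarizing indexes, and externally stored values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy attribute set: bit i set means column i is a member.
 * Hypothetical stand-in for Bitmapset, limited to 64 columns. */
typedef uint64_t attrset;

/* Hypothetical lock levels, loosely modeled on "no key update" vs "update". */
typedef enum
{
    TOY_LOCK_NO_KEY_UPDATE,
    TOY_LOCK_UPDATE
} toy_lockmode;

/* A HOT update is only a candidate if no modified column is indexed. */
static bool
toy_hot_allowable(attrset modified, attrset indexed)
{
    return (modified & indexed) == 0;
}

/* Assumption: the old identity is needed when an identity column changed. */
static bool
toy_requires_replica_id(attrset modified, attrset identity)
{
    return (modified & identity) != 0;
}

/* Changing a key column needs the stronger lock; otherwise the weaker one. */
static toy_lockmode
toy_determine_lockmode(attrset modified, attrset key)
{
    return (modified & key) != 0 ? TOY_LOCK_UPDATE : TOY_LOCK_NO_KEY_UPDATE;
}

/* Small self-check: columns 0 and 2 are indexed, column 0 is the key. */
static void
toy_self_check(void)
{
    attrset indexed = (UINT64_C(1) << 0) | (UINT64_C(1) << 2);
    attrset key = UINT64_C(1) << 0;

    assert(toy_hot_allowable(UINT64_C(1) << 1, indexed));
    assert(!toy_hot_allowable(UINT64_C(1) << 2, indexed));
    assert(toy_determine_lockmode(UINT64_C(1) << 2, key) == TOY_LOCK_NO_KEY_UPDATE);
    assert(toy_determine_lockmode(UINT64_C(1) << 0, key) == TOY_LOCK_UPDATE);
}
```

The point of the sketch is only that all three decisions reduce to overlap tests between the set of modified attributes and a per-relation attribute set, which is why they can share one modified-attributes bitmap.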
I've tested with code that validates in heapam_tuple_update() that the modified attr bitmaps are identical:
{
    Bitmapset  *hot_attrs = RelationGetIndexAttrBitmap(relation,
                                                       INDEX_ATTR_BITMAP_INDEXED);
    Bitmapset  *id_attrs = RelationGetIndexAttrBitmap(relation,
                                                      INDEX_ATTR_BITMAP_IDENTITY_KEY);
    Bitmapset  *hdci_attrs = HeapDetermineColumnsInfo(relation, hot_attrs,
                                                      &oldtup, tuple);

    Assert(bms_equal(mix_attrs, hdci_attrs));
    bms_free(hot_attrs);
    bms_free(id_attrs);
    bms_free(hdci_attrs);
}
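The same cross-check idea can be shown outside the server as a small standalone sketch: compute the set of modified indexed columns along two independent paths and assert that they agree. This is an illustration only, assuming integer-valued rows of at most 64 columns and a toy 64-bit attribute set; the function names are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy attribute set: bit i set means column i is a member (max 64 columns). */
typedef uint64_t attrset;

/* Path A: diff every column, then keep only the indexed ones. */
static attrset
modified_via_full_diff(const int *oldrow, const int *newrow,
                       size_t ncols, attrset indexed)
{
    attrset changed = 0;

    for (size_t i = 0; i < ncols; i++)
        if (oldrow[i] != newrow[i])
            changed |= (attrset) 1 << i;
    return changed & indexed;
}

/* Path B: visit only the indexed columns and compare those. */
static attrset
modified_via_indexed_walk(const int *oldrow, const int *newrow,
                          size_t ncols, attrset indexed)
{
    attrset changed = 0;

    for (size_t i = 0; i < ncols; i++)
        if (((indexed >> i) & 1) != 0 && oldrow[i] != newrow[i])
            changed |= (attrset) 1 << i;
    return changed;
}

/* Cross-check both paths, mirroring the Assert(bms_equal(...)) above. */
static attrset
checked_modified_attrs(const int *oldrow, const int *newrow,
                       size_t ncols, attrset indexed)
{
    attrset a = modified_via_full_diff(oldrow, newrow, ncols, indexed);
    attrset b = modified_via_indexed_walk(oldrow, newrow, ncols, indexed);

    assert(a == b);
    return a;
}
```

For example, with old row {1, 2, 3, 4}, new row {1, 9, 3, 7}, and columns 1 and 2 indexed, both paths report only column 1 as a modified indexed column.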
Despite that passing, two tests became non-deterministic without an "ORDER BY" on a SELECT. You'll see those in generated_virtual.sql and updatable_views.sql in the second patch. I don't know yet why that happened, but the results are otherwise identical.
I will continue to performance test. Most things I've tried differ by less than 0.5% before/after this patch. Some operations where multiple rows are matched in an UPDATE and there are concurrent reads are faster (10-20%); I need to dig into this. I'm expanding the tests I'm running to try to find any cases where holding the buffer lock for less time could result in higher TPS. The goals for $subject were higher TPS but, more importantly, lower index bloat and shorter vacuum times.
Next up I'll work on re-introducing some of the other work from $subject and other changes in v29 and earlier patch sets:
* avoid the need for index_unchanged_by_update()
* re-add the new index AM function, "amcomparedatums()" or similar
* add a flag for types indicating that they have "sub-attributes" (JSONB, XML, ARRAY)
* during CREATE INDEX, when types with sub-attributes appear in expressions, store in pg_attribute the relation and some representation of what the sub-attribute is
* update JSONB functions that mutate content to use the pg_attribute information and record whether there were changes to indexed "sub-attributes"
* use that recorded information later in the executor to identify whether the indexed sub-attribute changed, opening the door for $subject without evaluating the before/after expressions
* re-examine partial indexes as well
* consider how one might layer a PHOT/WARM-thingie on this... (in a different thread in the future, like next year)
best.
-greg
| Attachment | Content-Type | Size |
|---|---|---|
| v20260219-0001-Idenfity-modified-indexed-attributes-in-th.patch | text/x-patch | 30.5 KB |
| v20260219-0002-Refactor-heap_update-and-move-attribute-de.patch | text/x-patch | 63.0 KB |