Re: Expanding HOT updates for expression and partial indexes

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Greg Burd <greg(at)burd(dot)me>, Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Expanding HOT updates for expression and partial indexes
Date: 2026-03-16 17:55:26
Message-ID: 7b8681a578b5fe77103948eeb7b5cdd80fabad5d.camel@j-davis.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 2026-03-16 at 12:23 -0400, Greg Burd wrote:
> Hello Jeff, thanks for taking a look! :)

Hi, thank you for working on this problem!

> > Why do extra work in ExecBRUpdateTriggers() to eliminate the false
> > negative case if we don't rely on it anyway? If we do need to rely
> > on
> > it in subsequent patches, then we need to be sure, right?
>
> Later commits do currently rely on it, ExecUpdateModifiedIdxAttrs()
> uses it in the next commit (0003) to avoid reviewing indexed
> attributes that could not have possibly changed.

OK. The first half of the commit message for 0002 is slightly confusing
because it's referring to pre-existing behavior, behavior changed by
the commit, and also future work. It might help to clarify the tenses
like:

- Previously, ExecGetAllUpdatedCols() had gaps ..., but not a real bug
because ...
- This commit closes those gaps by updating ri_extraUpdatedCols in
ExecBRUpdateTriggers(), making ExecGetAllUpdatedCols() reliable.
- We know there are no other gaps because ...
- Useful to fix because later work will rely on it for [very brief
reason]

>   Imagine a table with a lot of indexes where updates only modify one
> or two at a time.  Why are we testing indexed attributes for changes
> in HeapDeterminColumnsInfo() that couldn't have changed?  The answer
> is that a) HeapDeterminColumnsInfo() lives in heap, not the executor
> (see patch 0003) so it has no ability to call
> ExecGetAllUpdatedCols(), and b) the set returned by
> ExecGetAllUpdatedCols() is sometimes incomplete.

That's helpful, thank you.

> What do we "need to be sure" of?  That ExecGetAllUpdatedCols() not
> really contains all attributes that its name implies?  I think it now
> does that after 0002, do you disagree?

I don't disagree, but I think we need some kind statement that we
believe that it's true, and a brief explanation why. (I don't have much
of an opinion about whether it's in this thread, the commit message, or
the code.)

>
> I think it is a new guarantee that was implied before now but not
> required until 0003.  I think this change reduces overhead and helps
> to avoid some future security feature that depends on
> ExecGetAllUpdatedCols() to provide that guarantee.
>
> Does that make sense?

A subtlety here is that perhaps ExecGetAllUpdatedCols() already *was*
correct, and it just meant something different than we thought: the
*targeted* columns of an update, instead of the actually-updated
values.

If so we should think about whether that distinction should be
preserved. For instance, column filtering for triggers should be based
on the targeted columns (rather than actually-updated values) because,
semantically, it should still fire even for a no-op update. Perhaps
similar for choosing the lock mode?

>
Regards,
Jeff Davis

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Lakhin 2026-03-16 18:00:00 Re: pgcrypto/des tests fail on riscv64 due to clang's code generation anomaly
Previous Message Mahendra Singh Thalor 2026-03-16 17:55:22 Re: Adding REPACK [concurrently]