From: Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>, Nitin Motiani <nitinmotiani(at)google(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ilyasov Ian <ianilyasov(at)outlook(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Inval reliability, especially for inplace updates
Date: 2025-07-31 16:53:20
Message-ID: CA+renyU+LGLvCqS0=fHit-N1J-2=2_mPK97AQxvcfKm+F-DxJA@mail.gmail.com
Lists: pgsql-hackers
Ian Ilyasov and I reviewed this patch. We think it is ready to commit
to back branches.
The attached patch applies to REL_17_STABLE but not other stable
branches, so we assume Noah will adjust it as needed.
We were able to reproduce Alexander Lakhin's hang when we tested
against 0a0a0f2c59 (i.e. before the previous version was reverted),
though we did need to add the delay. With this patch applied, we
don't see the hang (even with the delay).
We agree that the new assertions are a good idea to prevent similar
errors in the future.
We couldn't devise other ways to break this patch. Surely more
experienced hackers could be more creative, but nonetheless it's
reassuring that the patch's twin has been in v18devel since November.
We assume you will also un-revert 8e7e672cda ("WAL-log inplace update
before revealing it to other sessions.")? We didn't look closely at
that patch, but it seems like there are no known problems with it. It
was just reverted because it depends on this patch.
Since this is a backpatch, it doesn't seem right to give non-essential
feedback. Here are a few thoughts anyway. Consider them notes for
future work rather than reasons to delay backpatching or drift from
the patch on master.
Is there any way to add more testing around non-transactional
invalidations? It is a new "feature" but it is not really tested
anywhere. I don't think we could do this with regress tests, but
perhaps isolation tests would be suitable.
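To make the suggestion concrete, an isolation test might pit an inplace
update against a concurrent DDL that consumes the inval. This is only a
sketch with invented table and step names, not something from the patch:

```
# Hypothetical isolation spec: exercise a non-transactional inval from an
# inplace pg_class update (VACUUM) racing a concurrent DROP TABLE.
setup
{
    CREATE TABLE inplace_tab (a int);
}

teardown
{
    DROP TABLE IF EXISTS inplace_tab;
}

session s1
step s1_vac  { VACUUM inplace_tab; }    # drives an inplace pg_class update

session s2
step s2_drop { DROP TABLE inplace_tab; }

permutation s1_vac s2_drop
```

Injection points or delays would likely still be needed to hit the
interesting interleavings, as in Alexander's reproducer.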
Some of the comments felt a bit compressed. They make sense in the
context of this fix, but reading them cold seems like it will be
challenging. For example this took a lot of thinking to follow:
 * Construct shared cache inval if necessary. Because we pass a tuple
 * version without our own inplace changes or inplace changes other
 * sessions complete while we wait for locks, inplace update mustn't
 * change catcache lookup keys. But we aren't bothering with index
 * updates either, so that's true a fortiori.
Or this:
* WAL contains likely-unnecessary commit-time invals from the
* CacheInvalidateHeapTuple() call in heap_inplace_update().
Why likely-unnecessary? I know you explain it at that callsite, but
some hint might help here.
It's a bit surprising that wrongly leaving relhasindex=t is safe (for
example after BEGIN; CREATE INDEX; ROLLBACK;). I guess this column is
just to save us a lookup for tables with no index, and no harm is done
if we do the lookup needlessly but find no indexes. And vacuum can
repair it later. Still it's a little unnerving.
On Thu, Oct 31, 2024 at 09:20:52PM -0700, Noah Misch wrote:
> Here, one of the autovacuum workers had the guilty stack trace, appearing at
> the end of this message. heap_inplace_update_and_unlock() calls
> CacheInvalidateHeapTupleInplace() while holding BUFFER_LOCK_EXCLUSIVE on a
> buffer of pg_class. CacheInvalidateHeapTupleInplace() may call
> CatalogCacheInitializeCache(), which opens the cache's rel. If there's not a
> valid relcache entry for the catcache's rel, we scan pg_class to make a valid
> relcache entry. The ensuing hang makes sense.
Personally I never expected that catcache could depend on relcache,
since it seems lower-level. But it makes sense that you need a
relcache of pg_class at least, so their relationship is more
complicated than just layers.
I'm struggling to understand how your explanation incorporates
*concurrency*, since a self-deadlock only involves locks from one
backend. But the point is that a concurrent invalidation causes the
relcache entry to vanish, so that we need to rebuild it. (We can't get
this far without having built the relcache for pg_class once already.)
Specifically, we drop the table while autovacuum is updating its
statistics. But how is that possible? Don't both those things
exclusive-lock the row in pg_class? I must be misunderstanding.
> Tomorrow, I'll think more about fixes. Two that might work:
>
> 1. Call CacheInvalidateHeapTupleInplace() before locking the buffer. Each
> time we need to re-find the tuple, discard the previous try's inplace
> invals and redo CacheInvalidateHeapTupleInplace(). That's because
> concurrent activity may have changed cache key fields like relname.
We agree that choice 1 is a good approach.
 PrepareToInvalidateCacheTuple(Relation relation,
                               HeapTuple tuple,
                               HeapTuple newtuple,
-                              void (*function) (int, uint32, Oid))
+                              void (*function) (int, uint32, Oid, void *),
+                              void *context)
It's a little odd that PrepareToInvalidateCacheTuple takes a callback
function when it only has one caller, so it always calls
RegisterCatcacheInvalidation. Is it just to avoid adding dependencies
to inval.c? But it already #includes catcache.h and contains lots of
knowledge about catcache specifics. Maybe originally
PrepareToInvalidateCacheTuple was built to take
RegisterRelcacheInvalidation as well? Is it worth still passing the
callback?
@@ -6511,6 +6544,7 @@ heap_inplace_unlock(Relation relation,
{
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
UnlockTuple(relation, &oldtup->t_self, InplaceUpdateTupleLock);
+ ForgetInplace_Inval();
}
Is this the right place to add this? We think on balance yes, but the
question crossed my mind: clearing the invals seems like a separate
responsibility from unlocking the buffer and tuple. After this patch,
the only remaining caller of heap_inplace_unlock is
systable_inplace_update_cancel, so perhaps it should call
ForgetInplace_Inval itself. On the other hand, we like that putting it
here guarantees it gets called, as a complement to building the invals
in heap_inplace_lock.
Yours,
--
Paul ~{:-)
pj(at)illuminatedcomputing(dot)com