Quick Links

Re: Report oldest xmin source when autovacuum cannot remove tuples

From:	Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To:	shinya11(dot)kato(at)gmail(dot)com
Cc:	qiuwenhuifx(at)gmail(dot)com, samimseih(at)gmail(dot)com, japinli(at)hotmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: Report oldest xmin source when autovacuum cannot remove tuples
Date:	2026-06-29 08:22:07
Message-ID:	20260629.172207.542761264338459554.horikyota.ntt@gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hello,

I looked at this a bit further and had several comments.

> if (kind != VISHORIZON_SHARED &&
> + proc->databaseId != MyDatabaseId &&
> + MyDatabaseId != InvalidOid &&
> + !(statusFlags & PROC_AFFECTS_ALL_HORIZONS) &&
> + !in_recovery)
> + continue;

First, although the comment says this mirrors the existing logic, it
also adds extra conditions. I think it would be clearer to separate
the common part from the function-specific part, so that the former
remains almost a direct copy of the existing logic and the latter is
handled separately.

Related to that, I am worried that future changes to
ComputeXidHorizons() could easily leave GetXidHorizonBlockers() out of
sync. Would it be possible to factor the common decision logic out
into a suitably named helper that both functions could use?

Second, as far as I can tell, in_recovery is not a per-process
condition, but rather controls whether the other filtering conditions
are applied at all. Flattening those two concerns into a single
condition makes the logic rather difficult to follow, particularly
because the underlying filtering rules are already fairly involved. As
written, it appears that recovery mode bypasses all of the other
filtering conditions, so all processes are considered by the
subsequent checks. Is that really the intended behavior?

In addition, I am also not sure the recovery case is handled
sufficiently here. As far as I understand, ComputeXidHorizons() does
not rely only on the proc array during recovery; it also considers the
oldest xmin from KnownAssignedXids. If the horizon is determined by
that value, GetXidHorizonBlockers() may not be able to find a matching
PGPROC entry. If this function is intended to explain the horizon
computed by ComputeXidHorizons(), I think the recovery case needs
either matching handling or an explicit comment explaining why it is
safe to ignore that source.

Also, RecoveryInProgress() can change from true to false while the
function is running. So I think it would be useful to explain
somewhere why it is safe to keep using the saved value. I do not see
such an explanation in ComputeXidHorizons() either, but this code made
me think about it.

> + if (TransactionIdEquals(proc_xid, horizon))
> + {
> + dst = &result[count++];
....
> + dst->type = XHB_XMIN_ACTIVE_TRANSACTION;
> + }
> +
> + if (dst)
> + {

This may just be a matter of taste, but I wonder whether this part
could be written a bit more clearly. For example, the first
if/else-if chain could determine only the blocker type, ending with an
"else continue". Then the common work, including allocating dst, could
be done afterwards in one place. I think that would make the flow a
bit easier to follow.

One more thought about the allocation strategy here. The fixed-size
"name" field seems to dominate the size of XidHorizonBlocker, but the
name is filled only after releasing ProcArrayLock. So I wonder whether
it needs to be embedded in every entry collected under the lock.

If "name" were stored separately, for example as a char * allocated
only when needed, the per-entry size would be much smaller. Then a
single array of XidHorizonBlocker entries preallocated before taking
ProcArrayLock would be less concerning. For example, with 1000
entries, an entry of around 24 bytes would be about 24KB, and the
strings would consume space only for entries that actually need one.

Another possible approach would be to collect only the lock-protected
fields in a small temporary array while holding ProcArrayLock, and
then build the full result array after releasing the lock.

> + active_proc = s->active_proc;
> + invalidated = s->data.invalidated != RS_INVAL_NONE;
> + SpinLockRelease(&s->mutex);
> +
> + /* Invalidated slots no longer hold back the horizon. */
> + if (invalidated)
> + continue;

This is a very minor point, but I wonder whether it would be a bit
cleaner to copy s->data.invalidated under the spinlock and perform the
comparison afterwards, keeping the locked section limited to copying
shared fields. I know there are other places that cache the boolean
result directly, but unless the boolean value is going to be reused, I
think comparing it afterwards reads a little more naturally.

Regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Re: Report oldest xmin source when autovacuum cannot remove tuples at 2026-06-27 08:08:02 from Shinya Kato

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Michael Paquier	2026-06-29 08:23:41	Re: statatt_build_stavalues->LOCAL_FCINFO wrong number
Previous Message	Corey Huinker	2026-06-29 08:14:11	Re: use of SPI by postgresImportForeignStatistics