Re: Row pattern recognition

From: Henson Choi <assam258(at)gmail(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, peter(at)eisentraut(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Row pattern recognition
Date: 2026-02-04 08:54:26
Message-ID: CAAAe_zC95wT9EdBu_nBn6mWVB7-xNNuJ4N8wPr0LrWJBdwmxiQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Tatsuo,

BTW, I have a question regarding NFA related objects initialization in
> nodeWindowAgg.c.
>
> In release_partition():
> /* Reset NFA state for new partition */
> winstate->nfaContext = NULL;
> winstate->nfaContextTail = NULL;
> winstate->nfaContextFree = NULL;
> winstate->nfaStateFree = NULL;
> winstate->nfaLastProcessedRow = -1;
> winstate->nfaStatesActive = 0;
> winstate->nfaContextsActive = 0;
>
> While in ExecInitWindowAgg():
> /* Initialize NFA free lists for row pattern matching */
> winstate->nfaContext = NULL;
> winstate->nfaContextTail = NULL;
> winstate->nfaContextFree = NULL;
> winstate->nfaStateFree = NULL;
> winstate->nfaLastProcessedRow = -1;
>
> So the latter does not initialize winstate->nfaStatesActive and
> winstate->nfaContextsActive. Can you please explain why?
>

You're absolutely right to notice this inconsistency. Let me explain:

## Why ExecInitWindowAgg() omits these two fields

In ExecInitWindowAgg(), the WindowAggState structure is created via
makeNode(WindowAggState), which internally uses palloc0(). This
zero-initializes all memory, so:

- All pointers → NULL (0)
- nfaStatesActive → 0
- nfaContextsActive → 0
- All other statistics counters → 0

Technically, only nfaLastProcessedRow = -1 truly needs explicit
initialization since it requires a non-zero value. The nfaContext = NULL
assignments are redundant but follow PostgreSQL's convention of
explicitly setting pointers to NULL for code clarity and documentation.

The pattern in ExecInitWindowAgg() follows PostgreSQL's common practice:
- Pointers: explicitly initialized to NULL (for readability)
- Counters: implicitly zero-initialized via palloc0() (intentionally
omitted)

This same pattern appears throughout the function for other fields too.
For example, framehead_slot and frametail_slot are explicitly set to NULL,
while framehead_valid and frametail_valid are left zero-initialized.

## What nfaStatesActive and nfaContextsActive do

These two fields serve as internal counters for tracking active NFA
resources:

nfaStatesActive: Tracks the current number of active NFA states in memory.
- Incremented in nfa_state_alloc() when allocating a new state
- Decremented in nfa_state_free() when returning a state to the free list
- Used to calculate nfaStatesMax (peak usage) for EXPLAIN ANALYZE

nfaContextsActive: Tracks the current number of active NFA contexts.
- Incremented in nfa_context_alloc() when creating a new context
- Decremented in nfa_context_free() when recycling a context
- Used to calculate nfaContextsMax (peak usage) for EXPLAIN ANALYZE

Both counters start at 0 and are updated during NFA execution to track
resource usage for performance monitoring.

## Why release_partition() must explicitly reset them

release_partition() is called when moving to a new partition and must
reset per-partition state. Unlike ExecInitWindowAgg() which creates a
fresh structure, release_partition() reuses an existing structure that
contains values from the previous partition.

The root cause of the reset requirement is that release_partition()
calls MemoryContextReset(winstate->partcontext), which frees all
partition-local memory. This includes:
- All NFA context structures pointed to by nfaContext
- All NFA state nodes in the free lists (nfaContextFree, nfaStateFree)
- All other partition-specific data

As a consequence of this memory reset:
1. The NFA pointers (nfaContext, nfaContextFree, etc.) must be set to
NULL to avoid dangling pointers
2. The active resource counters (nfaStatesActive, nfaContextsActive)
must be reset to 0 since all the resources they were counting have
been freed

Example scenario:
Partition 1 processing:
- nfaStatesActive increments one-by-one as states are allocated,
decrements as they are freed, fluctuates during execution,
ends at some non-zero value (e.g., 3)
- nfaContextsActive similarly fluctuates, ends at some value (e.g., 2)

Partition 2 starts → release_partition() called:
- MemoryContextReset(partcontext) frees all NFA structures
- nfaContext, nfaContextFree, etc. → must set to NULL
- nfaStatesActive: 3 → must reset to 0
- nfaContextsActive: 2 → must reset to 0

These counters track instantaneous active resources within a partition,
so they must be reset to 0 when starting a new partition. In contrast,
cumulative statistics like nfaStatesMax and nfaStatesTotalCreated are
NOT reset because they track overall query-wide statistics across all
partitions.

## Why only these two counters?

You might wonder why I don't propose initializing all the other NFA
statistics fields as well. ExecInitWindowAgg() currently omits explicit
initialization for many statistics fields:

Per-partition internal counters (reset at partition boundaries):
- nfaStatesActive = 0; (omitted)
- nfaContextsActive = 0; (omitted)

Query-wide cumulative statistics (never reset):
- nfaStatesMax = 0; (omitted)
- nfaStatesTotalCreated = 0; (omitted)
- nfaStatesMerged = 0; (omitted)
- nfaContextsMax = 0; (omitted)
- nfaContextsTotalCreated = 0; (omitted)
- nfaContextsAbsorbed = 0; (omitted)
- nfaContextsSkipped = 0; (omitted)
- nfaContextsPruned = 0; (omitted)
- nfaMatchesSucceeded = 0; (omitted)
- nfaMatchesFailed = 0; (omitted)
- nfaMatchLen = {0, 0, 0}; (omitted)
- nfaFailLen = {0, 0, 0}; (omitted)
- nfaAbsorbedLen = {0, 0, 0}; (omitted)
- nfaSkippedLen = {0, 0, 0}; (omitted)

The key distinction is that nfaStatesActive and nfaContextsActive are
the ONLY statistics that get reset in release_partition(). All the other
statistics are cumulative across the entire query and should never be
reset between partitions.

For example, nfaStatesMax tracks the peak number of active states across
ALL partitions, not just within one partition. Similarly,
nfaStatesTotalCreated accumulates the total count across all partitions.
These cumulative statistics start at 0 (via palloc0) and only increase,
never getting reset.

Therefore, I believe it only makes sense to add explicit initialization
for nfaStatesActive and nfaContextsActive, since these are the only two
that release_partition() also explicitly resets. Adding explicit
initialization for all the cumulative statistics would be misleading,
as it might suggest they get reset somewhere (which they don't).

## Proposal for consistency

For consistency with release_partition(), I can add just the two
per-partition counter initializations to ExecInitWindowAgg():

/* Initialize NFA free lists for row pattern matching */
winstate->nfaContext = NULL;
winstate->nfaContextTail = NULL;
winstate->nfaContextFree = NULL;
winstate->nfaStateFree = NULL;
winstate->nfaLastProcessedRow = -1;
winstate->nfaStatesActive = 0; // Add this
winstate->nfaContextsActive = 0; // Add this

This would make both ExecInitWindowAgg() and release_partition()
initialize the same set of per-partition NFA fields, making the code
more uniform and easier to review. The cumulative statistics fields
remain omitted in both functions, which correctly reflects that they
are never reset.

Functionally, this change makes no difference since palloc0() already
zeros these fields, but it improves code consistency and makes the
per-partition vs. query-wide distinction clearer.

Would you like me to include this change in the next patch?

Best regards,
Henson

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2026-02-04 09:00:18 Re: Re[2]: [PATCH] Add last_executed timestamp to pg_stat_statements
Previous Message Chao Li 2026-02-04 08:23:16 Re: Add backendType to PGPROC, replacing isRegularBackend