Re: Flush some statistics within running transactions

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>
Subject: Re: Flush some statistics within running transactions
Date: 2026-01-31 17:16:03
Message-ID: CAA5RZ0tvxZTZ7Xwm7FjWxAY_OphCq7tmpCaLgas0qrh+J86pRg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> This makes things confusing because instead of just relying on
> dlist_is_empty(&pgStatPending) to check if we need to flush anything,
> the responsibility now moves to the callback, which now also has to
> account for all the ANYTIME fields.
>
> The way I can think about making this better is to somehow track if
> we have ANYTIME data that needs to be flushed a different way
> ( maybe a second pgStatPending list for anytime stats ?? )

Thinking about this a bit more, it seems the FLUSH_MIXED may
not be needed. Instead can we just introduce a new global variable
called pgstat_report_variable_anytime which acts like
pgstat_report_fixed, except it's set to true whenever we update
anytime variable-numbered stats

```
#define pgstat_count_heap_scan(rel)
\
do {
\
pgstat_report_variable_anytime = true;
\
if (pgstat_should_count_relation(rel))
\
(rel)->pgstat_info->counts.numscans++;
\
} while (0)

````

and reset whenever we call pgstat_report_anytime_stat, like this:

```
void
pgstat_report_anytime_stat(bool force)
{
.....
.........
if (!pgstat_report_variable_anytime && !pgstat_report_fixed)
return;

/* Flush stats outside of transaction boundary */
if (pgstat_report_variable_anytime)
pgstat_flush_pending_entries(nowait, true);
....
........
pgstat_report_variable_anytime = false;
}
```

and then inside pgstat_flush_pending_entries, we just look
for kind_info->flush_mode == FLUSH_ANYTIME to flush.

```
/* flush the stats (with the appropriate callback), if possible */
if (anytime_only &&
kind_info->flush_mode == FLUSH_ANYTIME &&
kind_info->flush_anytime_cb != NULL)
{
/* Partial flush of non-transactional fields only */
did_flush = kind_info->flush_anytime_cb(entry_ref, nowait);
is_partial_flush = true;
}
````

With this approach, we will not enter pgstat_flush_pending_entries
inside pgstat_report_anytime_stat unless we have anytime
variable stats to report.

Also, the anytime flush callback does not need to check if there are
any variable-numbered stats to flush. This will not be needed as
it is in v4-0004

```
+ /*
+ * Check if there are any non-transactional stats to flush. Avoid
+ * unnecessarily locking the entry if nothing accumulated.
+ */
+ if (!(lstats->counts.numscans > 0 ||
+ lstats->counts.tuples_returned > 0 ||
+ lstats->counts.tuples_fetched > 0 ||
+ lstats->counts.blocks_fetched > 0 ||
+ lstats->counts.blocks_hit > 0))
+ return true;
```

This feels like an easier approach to reason about and we don't
need to add a third flush mode.

Thoughts?

--
Sami Imseih
Amazon Web Services (AWS)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Álvaro Herrera 2026-01-31 17:20:14 Re: using index to speedup add not null constraints to a table
Previous Message David E. Wheeler 2026-01-31 16:35:28 Re: ABI Compliance Checker GSoC Project