| From: | Michael Paquier <michael(at)paquier(dot)xyz> |
|---|---|
| To: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
| Cc: | Sami Imseih <samimseih(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com> |
| Subject: | Re: Flush some statistics within running transactions |
| Date: | 2026-01-22 00:02:10 |
| Message-ID: | aXFpAisDKy6g_cKx@paquier.xyz |
| Lists: | pgsql-hackers |
On Wed, Jan 21, 2026 at 10:34:09AM +0000, Bertrand Drouvot wrote:
> No, 0003 also changes the flush mode for the database KIND. All the fields that
> I mentioned are inherited from relations stats and are flushed only at transaction
> boundaries (so they don't appear in pg_stat_database until the transaction
> finishes). Does that make sense? (if the database kind is not switched to
> flush any time then none would appear while the transaction is in progress, even
> the ones inherited from relations stats).
>
> PFA v3, also taking care of Zsolt's comment (thanks!) done up-thread.
While reading through 0001, I started to wonder which properties
and/or assumptions of a stats kind one has to rely on to decide what
flush_mode should be set to.  To put it more simply, why don't we just
do a periodic pgstat_report_stat(false) call that would flush all the
stats for all stats kinds based on the new timeout registered,
expanding a bit the flush we currently do when idle in
ProcessInterrupts()?  It seems that one point of contention is
that we should be careful with entries in the shmem hash table that
have been created in a transactional way, but we may already flush
them while we are in a transaction state, no?  Are there any fields in
a stats kind that we may not want to flush?  If yes, it sounds to
me that it would be better to document these in the structures to
explain the reason why one flush mode is chosen over the other.
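
To make the alternative concrete, here is a rough standalone sketch of a
timeout-driven flush of everything that is pending.  It is a toy model,
not PostgreSQL code: every name in it (SketchStats, sketch_flush_all(),
sketch_maybe_flush(), sketch_run_query(), SKETCH_FLUSH_INTERVAL_MS) is
hypothetical, sketch_flush_all() only stands in for what a
pgstat_report_stat(false) call would do, and the 10s interval is an
assumption for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flush interval in milliseconds (assumption: 10s). */
#define SKETCH_FLUSH_INTERVAL_MS 10000

/* Toy model of one backend's stats, not a PostgreSQL structure. */
typedef struct SketchStats
{
    int64_t last_flush_ms;  /* simulated time of the last flush */
    int64_t pending;        /* counts accumulated locally, not yet visible */
    int64_t shared;         /* counts visible to other backends */
    int     nflushes;       /* number of flushes performed so far */
} SketchStats;

/* Stand-in for a flush of all stats kinds: move pending counts to shared. */
static void
sketch_flush_all(SketchStats *st, int64_t now_ms)
{
    st->shared += st->pending;
    st->pending = 0;
    st->last_flush_ms = now_ms;
    st->nflushes++;
}

/*
 * Stand-in for a check done at an interrupt-processing point: flush only
 * once the interval has elapsed since the previous flush.
 */
static bool
sketch_maybe_flush(SketchStats *st, int64_t now_ms)
{
    if (now_ms - st->last_flush_ms < SKETCH_FLUSH_INTERVAL_MS)
        return false;
    sketch_flush_all(st, now_ms);
    return true;
}

/*
 * Simulate a query running for duration_ms inside one transaction.  Each
 * step adds one pending count and then reaches a flush check.  Returns the
 * number of flushes done in total.
 */
static int
sketch_run_query(SketchStats *st, int64_t duration_ms, int64_t step_ms)
{
    for (int64_t now = step_ms; now <= duration_ms; now += step_ms)
    {
        st->pending++;
        (void) sketch_maybe_flush(st, now);
    }
    return st->nflushes;
}
```

With a zero-initialized SketchStats, sketch_run_query(&st, 65000, 1000)
does six flushes: 60 counts become visible while the simulated
transaction is still running, and the last 5 stay pending until whatever
flush happens at transaction end.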
I am also not convinced that we have to be that aggressive with these
extra flushes.  The target is long-running analytical queries, which
could take minutes or even hours.  Using the same value as
PGSTAT_IDLE_INTERVAL (10s), perhaps renaming the value while on it,
would be a more natural fit.  A 1s vs 10s report interval does not
really matter for long analytical queries, where I'd imagine the data
being polled every 30s at the shortest.  Of
course, one may want to get a more "live" representation of the data
with more aggressive flushes, but is it really helpful for
long-running queries to have more granularity, at the cost of putting
more stress on the shmem state?
--
Michael