| From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
|---|---|
| To: | Sami Imseih <samimseih(at)gmail(dot)com> |
| Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com> |
| Subject: | Re: Flush some statistics within running transactions |
| Date: | 2026-02-24 12:01:30 |
| Message-ID: | aZ2TGo5bkDvYOfIw@ip-10-97-1-34.eu-west-3.compute.internal |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
On Mon, Feb 23, 2026 at 05:47:22PM -0600, Sami Imseih wrote:
> > > For variable-length statistics, perhaps we can do things a bit
> > > differently than what is currently proposed. 0005 requires
> > > a relation anytime stat update to call
> > > pgstat_schedule_anytime_update(). This is done this way because
> > > it allows long-running queries to update their stats every
> > > stats_flush_interval using a timeout.
> > >
> > > But maybe what we should be doing for variable-numbered stats is
> > > to schedule an anytime update whenever a "transaction goes idle".
> >
> > I think the logic for fixed stats and variable stats should be the same. If
> > not we could observe discrepancies: for example a long running select could
> > genereate reads/hits IO visible in pg_stat_io but tuples_returned, tuples_fetched,
> > blocks_fetched or blocks_hit would not be updated until the session goes idle.
>
> After having more time to think about this, I believe it can be much simpler.
> As soon as we enter an idle-in-transaction (aborted) state, we can simply
> schedule an anytime update. This ensures that a flush is scheduled whenever
> the fixed stats trigger one, which will likely be the most common reason
> (e.g., I/O stats, WAL stats, etc.). To cover the cases where fixed stats
> do not schedule a flush, we can also schedule one as soon as a transaction
> goes idle.
>
> In my mind, this makes this whole flushing scheduling behavior easy to reason
> about, and if we introduce future anytime stats anywhere, we are not required
> to schedule a flush for each individual field. The flush callback will of course
> still need to decide what to flush anytime or at the transaction boundary.
>
> What do you think?
My understanding is that (correct me if I'm wrong):
- fixed stats would still be designed the way it is in v11
- variable stats would not need the pgstat_schedule_anytime_update() calls in
various places. The flush would be done/schedule when the session goes idle.
Then I agree that that looks ok and that:
> This ensures that a flush is scheduled whenever
> the fixed stats trigger one, which will likely be the most common reason
> (e.g., I/O stats, WAL stats, etc.)
Though I don't think that adresses Michael's concern: "main worries are
mainly around 1), I guess, with the new SIGALRM handler requirements for all
auxiliary processes" in [1].
Regards,
[1]: https://postgr.es/m/aZznT84Ssh8PywcH%40paquier.xyz
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | David Rowley | 2026-02-24 12:33:12 | Re: More speedups for tuple deformation |
| Previous Message | Rahila Syed | 2026-02-24 11:57:17 | Re: Enhancing Memory Context Statistics Reporting |