Re: Flush some statistics within running transactions

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>
Subject: Re: Flush some statistics within running transactions
Date: 2026-01-22 01:56:48
Message-ID: CAHGQGwGBkPEK=NpLuk1HSmccu6OA2FYqkkGcA-Hb3WLs6dX=cg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 22, 2026 at 10:41 AM Sami Imseih <samimseih(at)gmail(dot)com> wrote:
>
> > > No, 0003 also changes the flush mode for the database KIND. All the fields that
> > > I mentioned are inherited from relations stats and are flushed only at transaction
> > > boundaries (so they don't appear in pg_stat_database until the transaction
> > > finishes). Does that make sense? (if the database kind is not switched to
> > > flush any time then none would appear while the transaction is in progress, even
> > > the ones inherited from relations stats).
> > >
> > > PFA v3, also taking care of Zsolt's comment (thanks!) done up-thread.
> >
> > While reading through 0001, I started to wonder which properties
> > and/or assumptions of a stats kind one has to rely on to decide what
> > flush_mode should be set to. To put it more simply, why don't we just
> > do a periodic pgstat_report_stat(false) call that would flush all the
> > stats for all stats kinds based on the new timeout registered,
> > expanding a bit the flush we currently do when idle in
> > ProcessInterrupts()?
>
> There are some important cases in which we would want to
> distinguish between a "transaction boundary" flush vs an
> "anytime" flush.
>
> For example, xact_commit/rollback. I would want those
> fields to be in sync with tuples_inserted/updated/deleted
> to allow for accurate calculations like number of inserts
> per commit, etc.
>
> Another one would be n_mod_since_analyze. That should
> only be updated after commit (or not after rollback). Otherwise,
> it may throw autoanalyze threshold calculations way off. Same
> for n_dead_tup and autovacuum.
>
> > I am also not convinced that we have to be that aggressive with these
> > extra flushes. The target is long-running analytical queries, that
> > could take minutes or even hours. Using the same value as
> > PGSTAT_IDLE_INTERVAL (10s),
>
> PGSTAT_IDLE_INTERVAL is flushing an idle backend every 10 seconds
> IIUC. So this value only applies when outside of a transaction.
>
> > A 1s vs 10s report interval does not really matter for long analytical queries.
>
> Sure, Bertrand mentioned early in the thread that the anytime flushes
> could be made configurable. Perhaps that is a good idea: we could
> default to something large like 10s intervals for anytime flushes, but allow
> the user to configure more frequent flushes (although I would think
> that 1 sec is the minimum we should allow).

+1 on adding an option to control the interval. With a fixed interval
(for example, 1s), log_lock_waits messages could be emitted that frequently,
which may be annoying for some users.

Of course, it would be even better if these periodic wakeups did not trigger
log_lock_waits messages at all, though.
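
To make the two points quoted above concrete, here is a rough, standalone
sketch (not taken from any of the posted patches) of how a configurable
anytime-flush interval and the "transaction boundary" vs "anytime" split
could fit together. Every identifier below is hypothetical, including
STAT_OTHER_COUNTER, which just stands in for a counter assumed to be safe to
flush mid-transaction. The 10s default and 1s floor are the values suggested
up-thread; treating a non-positive setting as "use the default" is my own
assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* All names here are hypothetical; none of them exist in PostgreSQL. */

typedef enum FlushMode
{
    FLUSH_AT_XACT_BOUNDARY,     /* flushed only at commit/rollback */
    FLUSH_ANYTIME               /* may also be flushed mid-transaction */
} FlushMode;

typedef enum StatField
{
    STAT_XACT_COMMIT,
    STAT_TUPLES_INSERTED,
    STAT_N_MOD_SINCE_ANALYZE,
    STAT_N_DEAD_TUP,
    STAT_OTHER_COUNTER,         /* placeholder for an "anytime" counter */
    NUM_STAT_FIELDS
} StatField;

#define ANYTIME_FLUSH_MIN_MS      1000   /* 1s floor suggested up-thread */
#define ANYTIME_FLUSH_DEFAULT_MS 10000   /* 10s default suggested up-thread */

/* Fields that must stay consistent with commit/rollback are not "anytime". */
static FlushMode
flush_mode_for(StatField field)
{
    assert((int) field >= 0 && (int) field < NUM_STAT_FIELDS);

    switch (field)
    {
        case STAT_XACT_COMMIT:
        case STAT_TUPLES_INSERTED:
        case STAT_N_MOD_SINCE_ANALYZE:
        case STAT_N_DEAD_TUP:
            return FLUSH_AT_XACT_BOUNDARY;
        default:
            return FLUSH_ANYTIME;
    }
}

/* Clamp a user-supplied interval; non-positive means "use the default". */
static int
clamp_flush_interval_ms(int requested_ms)
{
    if (requested_ms <= 0)
        return ANYTIME_FLUSH_DEFAULT_MS;
    if (requested_ms < ANYTIME_FLUSH_MIN_MS)
        return ANYTIME_FLUSH_MIN_MS;
    return requested_ms;
}

/*
 * At a transaction boundary everything is flushed.  Mid-transaction, only
 * "anytime" fields are flushed, and only once the interval has elapsed.
 */
static bool
should_flush_now(StatField field, bool at_xact_boundary,
                 long ms_since_last_flush, int interval_ms)
{
    if (at_xact_boundary)
        return true;

    return flush_mode_for(field) == FLUSH_ANYTIME &&
        ms_since_last_flush >= interval_ms;
}
```

With this shape, a larger interval directly reduces how often a backend is
woken up mid-transaction, which is the knob I am asking for. It does not by
itself address whether such a wakeup produces a log_lock_waits message.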

Regards,

--
Fujii Masao
