| From: | Sami Imseih <samimseih(at)gmail(dot)com> |
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> |
| Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, lukas(at)fittl(dot)com, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Improve pg_stat_statements scalability |
| Date: | 2026-06-12 23:21:42 |
| Message-ID: | CAA5RZ0s5z2=8fZ3FMG9fs32JctuLX-WV4jJ5X_w04MMihw9Q-A@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
Sorry for the delay, and thank you for the comments!
> On Wed, Jun 03, 2026 at 05:10:39PM +0900, Kyotaro Horiguchi wrote:
> > One alternative would be to introduce a separate callback for anytime
> > flushes. However, since transaction-boundary flushes ultimately need
> > to flush everything anyway, it seems to me that passing a context flag
> > would be sufficient.
>
> Nah. I am not really in favor of an extra callback. I feel that this
> could lead to more mistakes when implementing new stats kinds, so
> doing your approach of using a value based on the context of the flush
> works for me.
I will proceed with this approach.
> > I tend to agree with the single-entry-point approach.
> >
> > The basic operation is common; the anytime case differs only in the
> > context in which the flush is requested. The flush-timing logic in
> > pgstat_report_stat() may become slightly more complicated with anytime
> > flushes, but after that point I think the existing path should
> > continue to work with only minor adjustments, by carrying an explicit
> > context value down to the callbacks.
>
> Yeah, same feeling here. That would be the way I'd use to approach
> this design, and to aim for one single entry point for all the stats
> reports and the flushes, for all the contexts.
We can keep the scope of the work for now limited to the testing
infrastructure and allow pg_stat_force_next_flush() to call
pgstat_report_stat() while within a transaction. This will not require
throttling.
> 0004: pg_stat_statements: modernize entry storage with pgstat kind
> The main patch. Replaces ShmemInitHash with dshash via DSM registry,
v3-0004 introduces a secondary dshash meant to track the query
dsa_pointer, but its larger purpose is to avoid having
pg_stat_statements_internal or eviction do a full scan on the
core dshash to look up its own entries; which means filtering
out entries for other kinds.
After discussing this approach with several people at recent
conferences (Lukas Fittl and Robert Haas), I think we should do
something more robust in the core stats system instead.
I have implemented an approach, which I will share in the next
revision, where every stats kind is written to its own dedicated
dshash rather than into the common one. This is a new
configuration in PgStat_KindInfo where an extension (or even a
core stats kind) can request to be placed in a dedicated hash,
else in the common hash.
I like this better, especially for kinds with unique access
patterns like pg_stat_statements, where the view function and
eviction need to iterate over all their own entries frequently.
A dedicated hash eliminates the full-scan problem without a
secondary coordination dshash.
The implementation is straightforward and does not require
significant API changes.
--
Sami Imseih
Amazon Web Services (AWS)
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Khoa Nguyen | 2026-06-12 23:47:56 | Re: [PATCH] vacuumdb: Add --exclude-database option |
| Previous Message | Tom Lane | 2026-06-12 23:21:28 | Re: Use \if/\endif to remove non-libxml2 expected output in regression tests |