Re: shared-memory based stats collector - v70

From: Andres Freund <andres(at)anarazel(dot)de>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: shared-memory based stats collector - v70
Date: 2022-08-17 23:30:53
Message-ID: 20220817233053.sppiwk4a32u3x4rd@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-08-17 15:46:42 -0400, Greg Stark wrote:
> Isn't there also a local hash table used to find the entries to reduce
> traffic on the shared hash table? Even if you don't take a snapshot
> does it get entered there? There are definitely still parts of this
> I'm working on a pretty vague understanding of :/

Yes, there is. But it matters more for code that generates stats than for
reporting functions. While a backend has local pending stats, we need to hold
a refcount on the shared stats item, so that the item can't be dropped and
then revived when those local stats are flushed.

Relevant comments from pgstat.c:

* To avoid contention on the shared hashtable, each backend has a
* backend-local hashtable (pgStatEntryRefHash) in front of the shared
* hashtable, containing references (PgStat_EntryRef) to shared hashtable
* entries. The shared hashtable only needs to be accessed when no prior
* reference is found in the local hashtable. Besides pointing to the
* shared hashtable entry (PgStatShared_HashEntry) PgStat_EntryRef also
* contains a pointer to the shared statistics data, as a process-local
* address, to reduce access costs.
*
* The names for structs stored in shared memory are prefixed with
* PgStatShared instead of PgStat. Each stats entry in shared memory is
* protected by a dedicated lwlock.
*
* Most stats updates are first accumulated locally in each process as pending
* entries, then later flushed to shared memory (just after commit, or by
* idle-timeout). This practically eliminates contention on individual stats
* entries. For most kinds of variable-numbered pending stats data is stored
* in PgStat_EntryRef->pending. All entries with pending data are in the
* pgStatPending list. Pending statistics updates are flushed out by
* pgstat_report_stat().
*
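
To make the comment above a bit more concrete, here is a toy model of the
"local references in front of a shared table, accumulate pending, flush later"
pattern. It is only a sketch and not the pgstat.c code: SharedEntry, EntryRef,
Backend, shared_find_or_create(), get_entry_ref(), count_event() and
flush_pending() are all hypothetical names, plain arrays stand in for the hash
tables, and the per-entry lwlock that would be taken during the flush is left
out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ENTRIES 8

/* Hypothetical stand-in for a shared stats entry (not the real layout). */
typedef struct SharedEntry
{
    int  key;
    int  in_use;
    long counter;               /* value visible to every backend */
} SharedEntry;

/* Hypothetical stand-in for a local reference with its pending data. */
typedef struct EntryRef
{
    SharedEntry *shared;        /* process-local pointer to the shared entry */
    long         pending;       /* not yet flushed to shared memory */
} EntryRef;

/* Per-backend state: a small array plays the role of the local hashtable. */
typedef struct Backend
{
    EntryRef refs[MAX_ENTRIES];
    int      nrefs;
} Backend;

static SharedEntry shared_table[MAX_ENTRIES];
static int shared_lookups;      /* how often the "shared hashtable" is touched */

/* Slow path: look up or create the entry in the shared table. */
static SharedEntry *
shared_find_or_create(int key)
{
    shared_lookups++;
    for (int i = 0; i < MAX_ENTRIES; i++)
        if (shared_table[i].in_use && shared_table[i].key == key)
            return &shared_table[i];
    for (int i = 0; i < MAX_ENTRIES; i++)
        if (!shared_table[i].in_use)
        {
            shared_table[i].in_use = 1;
            shared_table[i].key = key;
            shared_table[i].counter = 0;
            return &shared_table[i];
        }
    return NULL;                /* table full */
}

/* Fast path: reuse a local reference; fall back to the shared table once. */
static EntryRef *
get_entry_ref(Backend *b, int key)
{
    for (int i = 0; i < b->nrefs; i++)
        if (b->refs[i].shared->key == key)
            return &b->refs[i];
    if (b->nrefs == MAX_ENTRIES)
        return NULL;
    SharedEntry *e = shared_find_or_create(key);
    if (e == NULL)
        return NULL;
    b->refs[b->nrefs].shared = e;
    b->refs[b->nrefs].pending = 0;
    return &b->refs[b->nrefs++];
}

/* Stats are accumulated locally only; shared memory is not modified here. */
static void
count_event(Backend *b, int key)
{
    EntryRef *ref = get_entry_ref(b, key);

    if (ref != NULL)
        ref->pending++;
}

/* Push all pending values into the shared entries (locking omitted). */
static void
flush_pending(Backend *b)
{
    for (int i = 0; i < b->nrefs; i++)
    {
        b->refs[i].shared->counter += b->refs[i].pending;
        b->refs[i].pending = 0;
    }
}
```

In this model, calling count_event() three times for the same key touches the
shared table only once, and the shared counter only changes when
flush_pending() runs.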

pgstat_internal.h has more details about the refcount aspect:

* Per-object statistics are stored in the "shared stats" hashtable. That
* table's entries (PgStatShared_HashEntry) contain a pointer to the actual stats
* data for the object (the size of the stats data varies depending on the
* kind of stats). The table is keyed by PgStat_HashKey.
*
* Once a backend has a reference to a shared stats entry, it increments the
* entry's refcount. Even after stats data is dropped (e.g., due to a DROP
* TABLE), the entry itself can only be deleted once all references have been
* released.
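
A minimal sketch of that lifetime rule, again with made-up names
(SharedEntry, entry_acquire(), entry_drop(), entry_release(), entry_flush())
and without the atomics and locking real shared memory would need. The
assumption modelled here is that a flush into an already-dropped entry is
simply discarded, and that the slot is only given back once the last
reference goes away:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical shared entry; field names are illustrative only. */
typedef struct SharedEntry
{
    bool in_use;                /* slot is still allocated */
    bool dropped;               /* stats were dropped, e.g. by DROP TABLE */
    int  refcount;              /* number of backends holding a reference */
    long counter;
} SharedEntry;

static void
entry_init(SharedEntry *e)
{
    e->in_use = true;
    e->dropped = false;
    e->refcount = 0;
    e->counter = 0;
}

/* The slot may only be reclaimed when it is dropped AND unreferenced. */
static void
entry_free_if_unreferenced(SharedEntry *e)
{
    if (e->dropped && e->refcount == 0)
        e->in_use = false;
}

/* A backend that obtains a reference bumps the refcount. */
static void
entry_acquire(SharedEntry *e)
{
    e->refcount++;
}

/* Dropping marks the entry; it stays allocated while references remain. */
static void
entry_drop(SharedEntry *e)
{
    e->dropped = true;
    entry_free_if_unreferenced(e);
}

/* Releasing the last reference to a dropped entry frees the slot. */
static void
entry_release(SharedEntry *e)
{
    e->refcount--;
    entry_free_if_unreferenced(e);
}

/* Assumption: pending data for a dropped entry is discarded, not applied. */
static bool
entry_flush(SharedEntry *e, long pending)
{
    if (e->dropped)
        return false;
    e->counter += pending;
    return true;
}
```

Because the slot stays allocated while a backend still references it, a late
flush finds the dropped marker instead of a recycled or re-created entry.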

Greetings,

Andres Freund
