Re: shared-memory based stats collector - v70

From: Greg Stark <stark(at)mit(dot)edu>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: shared-memory based stats collector - v70
Date: 2022-07-21 15:07:55
Message-ID: CAM-w4HMhHAxjpXNVMZ5WhULHpd3+LuP27zaSKSFnzZeRwHZTjg@mail.gmail.com
Lists: pgsql-hackers

On Wed, 20 Jul 2022 at 15:09, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Each backend only had stats for things it touched. But the stats collector read all files at startup into hash tables and absorbed all generated stats into those as well.

Fascinating. I'm surprised this didn't raise issues previously for
people with millions of tables. I wonder if it was in fact causing
issues and we just didn't hear about them because there were other,
bigger issues :)

> >All that said -- having all objects loaded in shared memory makes my
> >work way easier.
>
> What are you trying to do?

I'm trying to implement an exporter for prometheus/openmetrics/etc
that dumps directly from shared memory without going through the SQL
backend layer. I believe this will be much more reliable,
lower-overhead, safer, and more consistent than writing SQL queries.

Ideally I would want to dump out the stats without connecting to each
database. I suspect that would run into problems where the schema
really adds a lot of information (such as which table each index is
on, or which table a TOAST relation is for). There are also some
things people think of as stats that are actually maintained in the
catalog, such as reltuples and relpages. So I'm imagining that ideal
won't strictly hold in the end.

It seems like just having an interface to iterate over the shared hash
table and return entries one by one, without filtering by database,
would be fairly straightforward, and I would be able to do most of what
I want just with that. There's actually enough meta-information in the
stats entries to be able to handle them as they come, instead of
looking up specific stats one by one.
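
Roughly, I'm picturing something along these lines (going by the names
in pgstat_internal.h as of the committed shmem stats work, i.e.
pgStatLocal.shared_hash, PgStatShared_HashEntry and the dshash_seq_*
API, so treat it as a sketch of the shape rather than working code):

#include "postgres.h"
#include "lib/dshash.h"
#include "utils/pgstat_internal.h"

/* walk every stats entry in the shared hash, regardless of database */
static void
dump_all_stats_entries(void)
{
    dshash_seq_status hstat;
    PgStatShared_HashEntry *p;

    dshash_seq_init(&hstat, pgStatLocal.shared_hash, false);
    while ((p = dshash_seq_next(&hstat)) != NULL)
    {
        if (p->dropped)
            continue;

        /* key.kind tells us how to interpret the entry's body */
        elog(DEBUG1, "stats entry: kind %d, db %u, obj %u",
             (int) p->key.kind, p->key.dboid, p->key.objoid);
    }
    dshash_seq_term(&hstat);
}

The exporter would obviously need to follow each entry's body to the
actual counters rather than just log the key, but the iteration itself
is the part I'd want exposed.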

--
greg
