Re: shared-memory based stats collector - v70

From: "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: shared-memory based stats collector - v70
Date: 2022-08-16 12:49:04
Message-ID: 1c56f4c6-61ce-d131-553c-297326e20bf1@amazon.com
Lists: pgsql-hackers

Hi,

On 8/15/22 4:46 PM, Greg Stark wrote:
> On Thu, 11 Aug 2022 at 02:11, Drouvot, Bertrand <bdrouvot(at)amazon(dot)com> wrote:
>> As Andres was not -1 about that idea (as it should be low cost to add
>> and maintain) as long as somebody cares enough to write something: then
>> I'll give it a try and submit a patch for it.
> I agree it would be a useful feature. I think there may be things to
> talk about here though.
>
> 1) Are you planning to go through the local hash table and
> LocalSnapshot and obey the consistency mode? I was thinking a flag
> passed to build_snapshot to request global mode might be sufficient
> instead of a completely separate function.

I think the new API should behave as PGSTAT_FETCH_CONSISTENCY_NONE (as
querying across all the databases increases the risk of having to deal
with a "large" number of objects).

I have in mind to do something along these lines (I still need to add
some filtering, extra permission checks, ...):

+       dshash_seq_status hstat;
+       PgStatShared_HashEntry *p;
+       PgStatShared_Common *stats_data;
+
+       dshash_seq_init(&hstat, pgStatLocal.shared_hash, false);
+       while ((p = dshash_seq_next(&hstat)) != NULL)
+       {
+               Datum           values[PG_STAT_GET_ALL_TABLES_STATS_COLS];
+               bool            nulls[PG_STAT_GET_ALL_TABLES_STATS_COLS];
+               PgStat_StatTabEntry *tabentry;
+
+               if (p->key.kind != PGSTAT_KIND_RELATION)
+                       continue;
+
+               if (p->dropped)
+                       continue;
+
+               MemSet(values, 0, sizeof(values));
+               MemSet(nulls, false, sizeof(nulls));
+
+               stats_data = dsa_get_address(pgStatLocal.dsa, p->body);
+               LWLockAcquire(&stats_data->lock, LW_SHARED);
+               tabentry = pgstat_get_entry_data(PGSTAT_KIND_RELATION, stats_data);
+
+               values[0] = ObjectIdGetDatum(p->key.dboid);
+               values[1] = ObjectIdGetDatum(p->key.objoid);
+               values[2] = Int64GetDatum(tabentry->tuples_inserted);
...
+               tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
+               LWLockRelease(&stats_data->lock);
+       }
+       dshash_seq_term(&hstat);

What do you think?

> 2) When I did the function attached above I tried to avoid returning
> the whole set and make it possible to process them as they arrive.

Is that the way it has been done? (I did not look at your function yet.)

> I
> actually was hoping to get to the point where I could start shipping
> out network data as they arrive and not even buffer up the response,
> but I think I need to be careful about hash table locking then.

When using dshash_seq_next(), the already-returned elements are locked.

But I guess you would like to unlock them (if you are able to process
them as they arrive)?
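
To illustrate (a minimal sketch of the locking pattern, based on my
reading of dshash.c; not part of the patch): the partition holding the
current entry stays locked until the scan moves past it or the scan is
terminated, so anything slow done inside the loop (like shipping rows
over the network) extends how long that partition lock is held:

        dshash_seq_status hstat;
        PgStatShared_HashEntry *p;

        /* exclusive = false: partition locks are taken in shared mode */
        dshash_seq_init(&hstat, pgStatLocal.shared_hash, false);
        while ((p = dshash_seq_next(&hstat)) != NULL)
        {
                /*
                 * The partition containing "p" is locked here; the lock is
                 * released when the scan advances to the next partition (or
                 * at dshash_seq_term() below), so avoid blocking in here.
                 */
        }
        dshash_seq_term(&hstat);        /* releases anything still held */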

> 3) They key difference here is that we're returning whatever stats are
> in the hash table rather than using the catalog to drive a list of id
> numbers to look up.

Right.

> I guess the API should make it clear this is what
> is being returned

Right. I think we'll end up with a set of relation ids (not their names)
and their associated stats.
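
For example (hypothetical column names, just to illustrate the shape of
the result set):

        TupleDesc       tupdesc;

        /* ids rather than names, plus the per-table counters */
        tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_ALL_TABLES_STATS_COLS);
        TupleDescInitEntry(tupdesc, (AttrNumber) 1, "dbid", OIDOID, -1, 0);
        TupleDescInitEntry(tupdesc, (AttrNumber) 2, "relid", OIDOID, -1, 0);
        TupleDescInitEntry(tupdesc, (AttrNumber) 3, "tuples_inserted",
                                           INT8OID, -1, 0);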

> -- on that note I wonder if I've done something
> wrong because I noted a few records with InvalidOid where I didn't
> expect it.

It looks like InvalidOid for the dbid means that the entry is for a
shared relation.
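
So in the loop above something like this could make that explicit
(sketch only):

        /* InvalidOid as dboid flags a shared relation, not a stray record */
        if (!OidIsValid(p->key.dboid))
        {
                /* entry is for a shared catalog (e.g. pg_database) */
        }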

Where did you see them (while not expecting them)?

> 4) I'm currently looping over the hash table returning the records all
> intermixed. Some users will probably want to do things like "return
> all Relation records for all databases" or "return all Index records
> for database id xxx". So some form of filtering may be best or perhaps
> a way to retrieve just the keys so they can then be looked up one by
> one (through the local cache?).

I have in mind to add some filtering on the dbid (I think it could be
useful for a monitoring tool that has a persistent connection to one
database but wants to pull the stats database by database).
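
Something along these lines in the loop above (sketch; "dbid" being a
hypothetical Oid argument of the function, with InvalidOid meaning "all
databases"):

        if (OidIsValid(dbid) && p->key.dboid != dbid)
                continue;               /* skip entries from other databases */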

I don't think a lookup through the local cache will work if the
entry/key is related to a database other than the one the API is
launched from.

> 5) On that note I'm not clear how the local cache will interact with
> these cross-database lookups. That should probably be documented...

Yeah, I don't think that would work (if by local cache you mean what is
in the relcache).

Regards,

--
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com
