Re: Split index and table statistics into different types of stats

From: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Split index and table statistics into different types of stats
Date: 2022-11-02 08:58:36
Message-ID: 4cc5476f-4360-1bed-c60d-2c21adf7db13@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 11/1/22 1:30 AM, Andres Freund wrote:
> Hi,
>
> On 2022-10-31 14:14:15 +0100, Drouvot, Bertrand wrote:
>> Please find attached a patch proposal to split index and table statistics
>> into different types of stats.
>>
>> This idea has been proposed by Andres in a couple of threads, see [1] and
>> [2].
>
> Thanks for working on this!
>

Thanks for looking at it!

>
>
>> diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
>> index 5b49cc5a09..8a715db82e 100644
>> --- a/src/backend/catalog/heap.c
>> +++ b/src/backend/catalog/heap.c
>> @@ -1853,7 +1853,7 @@ heap_drop_with_catalog(Oid relid)
>> RelationDropStorage(rel);
>>
>> /* ensure that stats are dropped if transaction commits */
>> - pgstat_drop_relation(rel);
>> + pgstat_drop_heap(rel);
>
> I don't think "heap" is a good name for these, even if there's some historical
> reasons for it. Particularly because you used "table" in some bits and pieces
> too.
>

Agree, replaced by "table" where appropriate in V3 attached.

>
>> /*
>> @@ -168,39 +210,55 @@ pgstat_unlink_relation(Relation rel)
>> void
>> pgstat_create_relation(Relation rel)
>> {
>> - pgstat_create_transactional(PGSTAT_KIND_RELATION,
>> - rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
>> - RelationGetRelid(rel));
>> + if (rel->rd_rel->relkind == RELKIND_INDEX)
>> + pgstat_create_transactional(PGSTAT_KIND_INDEX,
>> + rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
>> + RelationGetRelid(rel));
>> + else
>> + pgstat_create_transactional(PGSTAT_KIND_TABLE,
>> + rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
>> + RelationGetRelid(rel));
>> +}
>
> Hm - why is this best handled on this level, rather than at the caller?
>
>

Agree that it should be split in
pgstat_create_table()/pgstat_create_index() (also as it was already
split for the "drop" case): done in V3.

>> +/*
>> + * Support function for the SQL-callable pgstat* functions. Returns
>> + * the collected statistics for one index or NULL. NULL doesn't mean
>> + * that the index doesn't exist, just that there are no statistics, so the
>> + * caller is better off to report ZERO instead.
>> + */
>> +PgStat_StatIndEntry *
>> +pgstat_fetch_stat_indentry(Oid relid)
>> +{
>> + PgStat_StatIndEntry *indentry;
>> +
>> + indentry = pgstat_fetch_stat_indentry_ext(false, relid);
>> + if (indentry != NULL)
>> + return indentry;
>> +
>> + /*
>> + * If we didn't find it, maybe it's a shared index.
>> + */
>> + indentry = pgstat_fetch_stat_indentry_ext(true, relid);
>> + return indentry;
>> +}
>> +
>> +/*
>> + * More efficient version of pgstat_fetch_stat_indentry(), allowing to specify
>> + * whether the to-be-accessed index is shared or not.
>> + */
>> +PgStat_StatIndEntry *
>> +pgstat_fetch_stat_indentry_ext(bool shared, Oid reloid)
>> +{
>> + Oid dboid = (shared ? InvalidOid : MyDatabaseId);
>> +
>> + return (PgStat_StatIndEntry *)
>> + pgstat_fetch_entry(PGSTAT_KIND_INDEX, dboid, reloid);
>> }
>
> Do we need this split anywhere for now? I suspect not, the table case is
> mainly for the autovacuum launcher, which won't look at indexes "in isolation".
>

Yes I think so as pgstat_fetch_stat_indentry_ext() has its use case in
pgstat_copy_index_stats() (previously pgstat_copy_relation_stats()).

>
>
>> @@ -240,9 +293,23 @@ pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
>> PG_RETURN_INT64(result);
>> }
>>
>> +Datum
>> +pg_stat_get_index_blocks_fetched(PG_FUNCTION_ARGS)
>> +{
>> + Oid relid = PG_GETARG_OID(0);
>> + int64 result;
>> + PgStat_StatIndEntry *indentry;
>> +
>> + if ((indentry = pgstat_fetch_stat_indentry(relid)) == NULL)
>> + result = 0;
>> + else
>> + result = (int64) (indentry->blocks_fetched);
>> +
>> + PG_RETURN_INT64(result);
>> +}
>
> We have so many copies of this by now - perhaps we first should deduplicate
> them somehow? Even if it's just a macro or such.
>

Yeah good point, a new macro has been defined for the "int64" return
case in V3 attached.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v3-0001-split_tables_indexes_stats.patch text/plain 100.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Yuya Watari 2022-11-02 09:27:52 Re: [PoC] Reducing planning time when tables have many partitions
Previous Message Aleksander Alekseev 2022-11-02 08:53:26 Re: Adding doubly linked list type which stores the number of items in the list