Re: Multiple Query IDs for a rewritten parse tree

From: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Multiple Query IDs for a rewritten parse tree
Date: 2022-01-31 09:59:17
Message-ID: 5a15bc96-93af-8788-1fcd-b9490add20c1@postgrespro.ru
Lists: pgsql-hackers

On 1/28/22 9:51 PM, Dmitry Dolgov wrote:
>> On Fri, Jan 21, 2022 at 11:33:22AM +0500, Andrey V. Lepikhov wrote:
>> Registration of a queryId generator is implemented by analogy with the
>> extensible methods machinery.
>
> Why not do it more like what was suggested, with stakind and slots in
> some data structure? All of those generators have to be iterated anyway,
> so I'm not sure a hash table makes sense.
Maybe, but it is not obvious. We don't really know how many extensions
could set a queryId.
For example, adaptive planning extensions will definitely want to set a
unique id (say, a simple counter) to trace a specific {query,plan} pair
across all executions (remember the plancache too). They would register
their own generator for that purpose.
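
To make the idea concrete, here is a minimal sketch of a slot-based
registry with a counter generator. It is only an illustration: every name
in it (register_label_generator, counter_label, generate_labels, the slot
limit, kind 100) is hypothetical and is not taken from the patch or from
PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; none of them is PostgreSQL API. */
#define MAX_LABEL_GENERATORS 8

typedef int64_t (*label_generator_fn) (const char *query_text);

typedef struct LabelGenerator
{
    int                 kind;   /* identifies the extension owning the slot */
    label_generator_fn  fn;     /* callback that computes the label */
} LabelGenerator;

static LabelGenerator generators[MAX_LABEL_GENERATORS];
static int  n_generators = 0;

/*
 * Register a generator. Returns 0 on success, -1 if the kind is already
 * taken or all slots are in use.
 */
static int
register_label_generator(int kind, label_generator_fn fn)
{
    assert(fn != NULL);

    for (int i = 0; i < n_generators; i++)
        if (generators[i].kind == kind)
            return -1;
    if (n_generators >= MAX_LABEL_GENERATORS)
        return -1;

    generators[n_generators].kind = kind;
    generators[n_generators].fn = fn;
    n_generators++;
    return 0;
}

/* A "simplistic counter" generator: every call yields a new value. */
static int64_t
counter_label(const char *query_text)
{
    static int64_t next_label = 0;

    (void) query_text;          /* the counter ignores the query itself */
    return ++next_label;
}

/*
 * Call every registered generator in turn; labels[i] receives the value
 * produced for kinds[i]. Returns the number of labels written.
 */
static int
generate_labels(const char *query_text, int *kinds, int64_t *labels, int max)
{
    int         n = 0;

    for (int i = 0; i < n_generators && n < max; i++)
    {
        kinds[n] = generators[i].kind;
        labels[n] = generators[i].fn(query_text);
        n++;
    }
    return n;
}
```

With a fixed array like this, all generators are simply iterated. A hash
table would only pay off if the number of registered extensions could grow
large, which is exactly the open question.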
>
>> Also, I switched queryId to the int64 type and renamed it to
>> 'label'.
>
> A name with "id" in it would be better, I believe. Label could be thought
> of as "the query belongs to a certain category", while the purpose is
> identification.
I think that is not entirely true. The current jumbling does not generate
a unique queryId (intentionally, I hope), and pg_stat_statements uses
queryId to group queries into classes.
To track a specific query along the execution path, it has to make
additional efforts (remembering the query nesting level, for example).
BTW, before [1], I tried to improve queryId so that it would be stable
under permutations of tables in the FROM clause and so on. That would
reduce the number of pg_stat_statements entries (a critical factor when
you use an ORM, like 1C, for example).
So I think queryId is both an id and a category.
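
As an illustration of what "stable under permutations" means, here is a
small sketch. It is an assumption for the sake of the example, not the
approach taken in [1] and not the real jumbling code: it hashes each table
name with FNV-1a and combines the hashes by addition, which is commutative,
so the order of tables cannot change the result. The names hash_name and
from_list_id are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a: a simple 64-bit string hash standing in for the real jumbling. */
static uint64_t
hash_name(const char *s)
{
    uint64_t    h = 14695981039346656037ULL;    /* FNV offset basis */

    for (; *s != '\0'; s++)
    {
        h ^= (unsigned char) *s;
        h *= 1099511628211ULL;                  /* FNV prime */
    }
    return h;
}

/*
 * Combine per-table hashes with addition. Addition is commutative and
 * unsigned overflow wraps around, so "FROM a, b" and "FROM b, a" produce
 * the same value.
 */
static uint64_t
from_list_id(const char *const *tables, size_t ntables)
{
    uint64_t    id = 0;

    assert(tables != NULL || ntables == 0);

    for (size_t i = 0; i < ntables; i++)
        id += hash_name(tables[i]);
    return id;
}
```

Both orderings of the same table list then fall into one
pg_stat_statements-style class, while a different set of tables gets a
different value (barring hash collisions).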
>
>> 2. We need a custom queryId, that is based on a generated queryId (according
>> to the logic of pg_stat_statements).
>
> Could you clarify?
pg_stat_statements takes the original queryId and changes it for its own
reasons (sometimes it zeroes it, sometimes not). So you can't use this
value in another extension and be confident that it is the original value
generated by JumbleQuery(). A custom queryId solves this problem.
>
>> 4. We should reserve position of default in-core generator
>
> From the discussion above I was under the impression that the core
> generator should be distinguished by a predefined kind.
Yes, but I think we should have a range of values large enough for use in
third-party extensions.
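
A rough sketch of how reserved kinds and per-kind slots could fit together
follows. The kind numbers, the slot count and all names (QueryLabels,
set_label, get_label, label_kind_is_extension) are hypothetical and chosen
only for illustration. It also shows why a per-kind slot helps: another
module zeroing the core value does not disturb an extension's own value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kind numbering; the actual values are illustrative only. */
#define LABEL_KIND_CORE         1       /* reserved for the in-core generator */
#define LABEL_KIND_EXT_FIRST    100     /* start of the extension range */
#define LABEL_KIND_EXT_LAST     9999    /* end of the extension range */
#define MAX_LABEL_SLOTS         4

typedef struct QueryLabel
{
    int         kind;           /* 0 means the slot is free */
    int64_t     value;
} QueryLabel;

typedef struct QueryLabels
{
    QueryLabel  slots[MAX_LABEL_SLOTS];
} QueryLabels;

/* Is this kind inside the range set aside for third-party extensions? */
static bool
label_kind_is_extension(int kind)
{
    return kind >= LABEL_KIND_EXT_FIRST && kind <= LABEL_KIND_EXT_LAST;
}

/* Store a value for a kind, reusing its slot if one already exists. */
static bool
set_label(QueryLabels *ql, int kind, int64_t value)
{
    QueryLabel *free_slot = NULL;

    assert(kind > 0);

    for (int i = 0; i < MAX_LABEL_SLOTS; i++)
    {
        if (ql->slots[i].kind == kind)
        {
            ql->slots[i].value = value;
            return true;
        }
        if (ql->slots[i].kind == 0 && free_slot == NULL)
            free_slot = &ql->slots[i];
    }
    if (free_slot == NULL)
        return false;           /* no room left */

    free_slot->kind = kind;
    free_slot->value = value;
    return true;
}

/* Fetch the value stored for a kind; returns false if there is none. */
static bool
get_label(const QueryLabels *ql, int kind, int64_t *value)
{
    for (int i = 0; i < MAX_LABEL_SLOTS; i++)
    {
        if (ql->slots[i].kind == kind)
        {
            *value = ql->slots[i].value;
            return true;
        }
    }
    return false;
}
```

An extension that stores its label under its own kind keeps reading the
same value even after the slot of LABEL_KIND_CORE has been overwritten
with zero.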
>
>> 5. We should add an EXPLAIN hook, to allow an extension to print this custom
>> queryId.
>
> Why? It would make sense if custom generation code will be generating
> some complex structure, but the queryId itself is still a hash.
>
An extension can print not only the queryId but also an explanation of its
kind, and maybe some additional logic.
Moreover, why shouldn't an extension show some useful monitoring data,
collected during query execution, in verbose mode?

[1]
https://www.postgresql.org/message-id/flat/e50c1e8f-e5d6-5988-48fa-63dd992e9565%40postgrespro.ru
--
regards,
Andrey Lepikhov
Postgres Professional
