Re: [survey] New "Stable" QueryId based on normalized query text

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com
Cc: legrand_legrand(at)hotmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [survey] New "Stable" QueryId based on normalized query text
Date: 2019-03-20 01:44:46
Message-ID: 20190320.104446.68838245.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Wed, 20 Mar 2019 00:23:30 +0000, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> wrote in <0A3221C70F24FB45833433255569204D1FBE20A4(at)G01JPEXMBYT05>
> From: legrand legrand [mailto:legrand_legrand(at)hotmail(dot)com]
> > There are many projects that use alternate QueryId
> > distinct from the famous pg_stat_statements jumbling algorithm.
>
> I'd like to welcome the standard QueryID that DBAs and extension developers can depend on. Are you surveying the needs for you to develop the QueryID that can meet as many needs as possible?

+1 to the necessity.

There's a similar thread about adding queryid in pg_stat_activity.

https://www.postgresql.org/message-id/CA%2B8PKvQnMfOE-c3YLRwxOsCYXQDyP8VXs6CDtMZp1V4%3DD4LuFA%40mail.gmail.com

> > needs.1: stable accross different databases,
>
> Does this mean different database clusters, not different databases in a single database cluster?

Does this mean you want different QueryID for the same-looking
query for another database in the same cluster?

> needs.5: minimal overhead to calculate
> needs.6: doesn't change across database server restarts
> needs.7: same value on both the primary and standby?
>
>
> > norm.9: comments aware
>
> Is this to distinguish queries that have different comments for optimizer hints? If yes, I agree.

Or, any means to give an explict query id? I saw many instances
of query that follows a comment describing a query id.

> needs.2: doesn't change after database or object rebuild,
> needs.3: search_path / schema independant,

pg_stat_statements even ignores table/object/column names.

> needs.4: pg version independant (as long as possible),

I don't think this cannot be guaranteed.

> norm.1: case insensitive
> norm.2: blank reduction
> norm.3: hash algoritm ?
> norm.4: CURRENT_DATE, CURRENT_TIME, LOCALTIME, LOCALTIMESTAMP not normalized
> norm.5: NULL, IS NULL not normalized ?
> norm.6: booleans t, f, true, false not normalized
> norm.7: order by 1,2 or group by 1,2 should not be normalized
> norm.8: pl/pgsql anonymous blocks not normalized

pg_stat_statements can be the base of the discussion on them.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2019-03-20 02:07:31 Re: BUG #15668: Server crash in transformPartitionRangeBounds
Previous Message Heikki Linnakangas 2019-03-20 01:10:51 Re: Sparse bit set data structure