Re: Fix pg_stat_statements display of normalized FETCH counts

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: Fix pg_stat_statements display of normalized FETCH counts
Date: 2026-05-12 06:07:30
Message-ID: F7B91E48-790C-4E61-9DDC-322526BFFED8@gmail.com
Lists: pgsql-hackers

> On May 12, 2026, at 13:42, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Mon, May 11, 2026 at 02:13:27PM +0800, Chao Li wrote:
>> Then the query text is shown as the normalized “fetch $1 c”. This
>> seems incorrect to me, because the representative query text should
>> not depend on the execution order of FETCH statements.
>
> This is an incorrect expectation. These query patterns are grouped
> together because they are represented the same way at their Node level
> after deparsing, and PGSS uses the first query string it finds when
> inserting the data of a query into a new slot. Specifying only a
> cursor name means a forward fetch with FETCH_KEYWORD_NONE, for one
> tuple. Specifying an integer (with or without from_in) and a cursor
> name means a forward fetch with FETCH_KEYWORD_NONE, for a specified
> number of tuples. The whole point is to normalize around the number
> of tuples all these queries. Both queries mean the same thing, once
> FetchStmt.howMany is moved into the "ignore" area of query jumbling.
>
>> The attached patch tries to fix this by adding a query_normalized
>> flag to pgssEntry, which records whether the stored representative
>> query text is already normalized. With this flag, if FETCH c is
>> executed first and stores an unnormalized query string, a later
>> FETCH 2 c can replace it with the normalized query string.
>
> Nope, I don't think that this is something we need to act on. Note
> that adding an extra generate_normalized_query() is not an acceptable
> thing to do: this has a performance impact and we don't want to make
> PGSS heavier than it is today. So it is inefficient, for one.
>
> The correct thing to do if we'd want to make the difference between
> the two cases would be to add a new value to FetchDirectionKeywords,
> and assign that to the "cursor_name" and "from_in cursor_name" case
> (say a new FETCH_KEYWORD_SINGLE?) in gram.y. I don't think that we
> need to do something here as this does not really represent a gain in
> terms of monitoring (aka more normalization is better to me here), but
> the new value would be the correct thing to do if it happens that
> folks want this difference to show up.
> --
> Michael

I’m still studying how to improve this patch, so your input is very timely.

If the “first query wins” behavior for the representative query text is intentional here, then I’m fine with withdrawing this patch. In any case, debugging this taught me a lot about how pg_stat_statements works.
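
For the archives, here is a minimal sketch of the “first query wins” behavior as I understand it from the explanation above. It is a toy model in Python, not pg_stat_statements code: the names toy_entry_key, toy_normalize, record and run are hypothetical, the regular expression only understands the plain forward FETCH forms discussed in this thread, and the key is a simplified stand-in for real query jumbling that merely ignores the row count.

```python
import re

# Toy grammar: FETCH [count] [FROM | IN] cursor_name
FETCH_RE = re.compile(
    r"\s*fetch\s+(?:(\d+)\s+)?(?:(?:from|in)\s+)?(\w+)\s*;?\s*",
    re.IGNORECASE,
)


def toy_entry_key(sql):
    """Stand-in for the query ID: the row count is ignored (assumption)."""
    m = FETCH_RE.fullmatch(sql)
    if m is None:
        raise ValueError("unsupported statement in this toy model: %r" % sql)
    return ("fetch", m.group(2).lower())


def toy_normalize(sql):
    """Replace an explicit row count with $1; leave other text unchanged."""
    m = FETCH_RE.fullmatch(sql)
    if m is None:
        raise ValueError("unsupported statement in this toy model: %r" % sql)
    if m.group(1) is None:
        return sql.strip()
    start, end = m.span(1)
    return (sql[:start] + "$1" + sql[end:]).strip()


def record(stats, sql):
    """Store the query text only when the entry is created, then count."""
    key = toy_entry_key(sql)
    entry = stats.get(key)
    if entry is None:
        # First statement seen for this key decides the displayed text.
        entry = {"query": toy_normalize(sql), "calls": 0}
        stats[key] = entry
    entry["calls"] += 1


def run(statements):
    stats = {}
    for sql in statements:
        record(stats, sql)
    return stats


if __name__ == "__main__":
    print(run(["fetch c", "fetch 2 c"]))
    print(run(["fetch 2 c", "fetch c"]))
```

In this model both orders produce a single entry with two calls, and only the displayed text differs depending on which statement ran first, which matches what I observed.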

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
