Re: pg_stat_statements: faster search by queryid

From: Karina Litskevich <litskevichkarina(at)gmail(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_stat_statements: faster search by queryid
Date: 2025-09-23 11:59:19
Message-ID: CACiT8iZM3gwBfgdTGVjKqMip4vJgubfNq6OEq564u9u66qMj-Q@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>> I do think having the ability to look up a specific entry based on a
>> key (that is, hash_search instead of hash_seq_search) would be useful.
>
> That's a great idea, thanks! I'm going to try that and include it in
> the next version of the patch if I succeed.

Here is the second version of the patch, as promised. I used hash_search
in case all three key arguments are provided, the same way as it is done
in the entry_reset() function. The diff doesn't look very pleasant.
Basically I just moved the code that forms one tuple in a new
pg_stat_statements_handle_entry() function to use it in
pg_stat_statements_internal().

I also had a second thought about adding a new struct just to pass three
key arguments as one filter to the internal function. In the v2 patch I
just pass the arguments as they are. I'm not sure which option is better.
Anyway, it should be the same in both entry_reset() and
pg_stat_statements_internal(), so if you say adding struct pgssFilter
was a good idea, I'll rewrite the patch to use it in both
pg_stat_statements_internal() and entry_reset().

On Thu, Sep 18, 2025 at 6:33 PM Sami Imseih <samimseih(at)gmail(dot)com> wrote:
>
> Yes, but my point is, if someone repeatedly lookup up pg_stat_statements
> with filters, they will end up loading the query text multiple times.
>
> for example:
> ````
> select * from pg_stat_statements where query_id in (10000, 20000, 30000);
> ```
>
> will only load the query text once to retrieve these 3 query IDs.
>
> If I instead do this, with the proposed patch:
>
> ```
> select * from pg_stat_statements(true, queryid=>10000);
> select * from pg_stat_statements(true, queryid=>20000);
> select * from pg_stat_statements(true, queryid=>30000);
>
> or
> select * from pg_stat_activity a, pg_stat_statements(true, queryid=>a.query_id);
>
> ```
> I will have to load the query text file into memory for every invocation of
> pg_stat_statements.
>

You are right. At some point, if information about multiple queries is
needed, a single select from pg_stat_statements followed by filtering
will be more efficient than calling pg_stat_statements with filters
multiple times. That's something that should be documented.

Best regards,
Karina Litskevich
Postgres Professional: http://postgrespro.com/

Attachment Content-Type Size
v2-0001-pg_stat_statements-add-ability-to-filter-statisti.patch text/x-patch 20.5 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Previous Message Bruce Momjian 2025-09-23 11:58:56 Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master