Re: query_id, pg_stat_activity, extended query protocol

From: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>, Andrei Lepikhov <lepihov(at)gmail(dot)com>
Cc: kaido vaikla <kaido(dot)vaikla(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: query_id, pg_stat_activity, extended query protocol
Date: 2024-04-24 01:40:45
Message-ID: F4F147FB-0E8D-4832-A41E-BF9A09F87EF1@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> I am also a bit surprised with the choice of using the first Query
> available in the list for the ID, FWIW.

IIUC, the query trees returned from QueryRewrite
will all have the same queryId, so it appears valid to
use the queryId from the first tree in the list. Right?

Here is an example I was working with that includes user-defined rules
that has a list with more than 1 tree.

postgres=# explain (verbose, generic_plan) insert into mytab values ($1) RETURNING pg_sleep($1), id ;
QUERY PLAN
-----------------------------------------------------------
Insert on public.mytab (cost=0.00..0.01 rows=1 width=4)
Output: pg_sleep(($1)::double precision), mytab.id
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: $1
Query Identifier: 3703848357297795425

Insert on public.mytab2 (cost=0.00..0.01 rows=0 width=0)
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: $1
Query Identifier: 3703848357297795425

Insert on public.mytab3 (cost=0.00..0.01 rows=0 width=0)
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: $1
Query Identifier: 3703848357297795425

Insert on public.mytab4 (cost=0.00..0.01 rows=0 width=0)
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: $1
Query Identifier: 3703848357297795425
(20 rows)

> Did you consider using \bind to show how this behaves in a regression
> test?

Yes, this is precisely how I tested. Without the patch, I could not
see a queryId after 9 seconds of a pg_sleep, but with the patch it
appears. See the test below.

## test query
select pg_sleep($1) \bind 30

## unpatched
postgres=# select
query_id,
query,
now()-query_start query_duration,
state
from pg_stat_activity where pid <> pg_backend_pid()
and state = 'active';
query_id | query | query_duration | state
----------+----------------------+-----------------+--------
| select pg_sleep($1) +| 00:00:08.604845 | active
| ; | |
(1 row)

## patched

postgres=# truncate table large;^C
postgres=# select
query_id,
query,
now()-query_start query_duration,
state
from pg_stat_activity where pid <> pg_backend_pid()
and state = 'active';
query_id | query | query_duration | state
---------------------+----------------------+----------------+--------
2433215470630378210 | select pg_sleep($1) +| 00:00:09.6881 | active
| ; | |
(1 row)

For exec_execute_message, I realized that to report queryId for
Utility and non-utility statements, we need to report the queryId
inside the portal routines where PlannedStmt contains the queryId.

Attached is the first real attempt at the fix.

Regards,

Sami

Attachment Content-Type Size
0001-v1-Fix-Extended-QUery-Protocol-handling-of-queryId.patch application/octet-stream 2.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2024-04-24 02:32:09 Re: Streaming I/O, vectored I/O (WIP)
Previous Message Thomas Munro 2024-04-23 23:43:12 Re: Requiring LLVM 14+ in PostgreSQL 18