Re: query_id, pg_stat_activity, extended query protocol

From: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
To: Andrei Lepikhov <lepihov(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>
Cc: kaido vaikla <kaido(dot)vaikla(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: query_id, pg_stat_activity, extended query protocol
Date: 2024-04-27 19:08:41
Message-ID: 724348C9-8023-41BC-895E-80634E79A538@amazon.com
Lists: pgsql-hackers

>> But a simplistic case with a prepared statement shows how the value of
>> queryId can be changed if you don't acquire all the objects needed for
>> the execution:

>> CREATE TABLE test();
>> PREPARE name AS SELECT * FROM test;
>> EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;
>> DROP TABLE test;
>> CREATE TABLE test();
>> EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;

> Hmm, you raise a good point. Isn't this a fundamental problem
> with prepared statements? If there is DDL on the
> relations of the prepared statement query, shouldn't the prepared
> statement be considered invalid at that point and raise an error
> to the user?

I tested v1 thoroughly.

Using the attached JDBC script for testing, I added some logging of the queryId
reported by the patch and set a breakpoint after sync [1], at which point the
locks on the table have been released. I then dropped and recreated the table
and observed that the first bind after recreating the table still reports the
old queryId, while the execute reports the correct queryId. This is because the
bind has not yet had a chance to re-parse and re-plan the statement after the
cache invalidation.
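
For reference, the test looks roughly like the sketch below (this is not the
attached Test.java itself; the connection URL, credentials, class name and sleep
interval are illustrative assumptions). prepareThreshold=1 makes pgJDBC use a
named server-side prepared statement, i.e. separate Parse/Bind/Execute
messages, from the first execution:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class ExtendedProtocolTest {
    public static void main(String[] args) throws Exception {
        // prepareThreshold=1: use a named server-side prepared statement
        // from the first execution instead of after the default 5.
        String url = "jdbc:postgresql://localhost:5432/postgres?prepareThreshold=1";

        try (Connection conn = DriverManager.getConnection(url, "postgres", "");
             PreparedStatement ps =
                 conn.prepareStatement("select from tab1 where id = ?")) {
            ps.setInt(1, 1);
            // Each iteration reuses the named statement. During one of the
            // sleeps, drop and recreate tab1 from another session so the
            // next Bind runs against an invalidated plan-cache entry.
            for (int i = 0; i < 5; i++) {
                ps.execute();
                Thread.sleep(10_000); // window to run the DROP/CREATE TABLE
            }
        }
    }
}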

2024-04-27 13:51:15.757 CDT [43483] LOG: duration: 21322.475 ms execute S_1: select pg_sleep(10)
2024-04-27 13:51:21.591 CDT [43483] LOG: duration: 0.834 ms parse S_2: select from tab1 where id = $1
2024-04-27 13:51:21.591 CDT [43483] LOG: query_id = -192969736922694368
2024-04-27 13:51:21.592 CDT [43483] LOG: duration: 0.729 ms bind S_2: select from tab1 where id = $1
2024-04-27 13:51:21.592 CDT [43483] LOG: query_id = -192969736922694368
2024-04-27 13:51:21.592 CDT [43483] LOG: duration: 0.032 ms execute S_2: select from tab1 where id = $1
2024-04-27 13:51:32.501 CDT [43483] LOG: query_id = -192969736922694368
2024-04-27 13:51:32.502 CDT [43483] LOG: duration: 0.342 ms bind S_2: select from tab1 where id = $1
2024-04-27 13:51:32.502 CDT [43483] LOG: query_id = -192969736922694368
2024-04-27 13:51:32.502 CDT [43483] LOG: duration: 0.067 ms execute S_2: select from tab1 where id = $1
2024-04-27 13:51:42.613 CDT [43526] LOG: query_id = -4766379021163149612
-- recreate the tables
2024-04-27 13:51:42.621 CDT [43526] LOG: duration: 8.488 ms statement: drop table if exists tab1;
2024-04-27 13:51:42.621 CDT [43526] LOG: query_id = 7875284141628316369
2024-04-27 13:51:42.625 CDT [43526] LOG: duration: 3.364 ms statement: create table tab1 ( id int );
2024-04-27 13:51:42.625 CDT [43526] LOG: query_id = 2967282624086800441
2024-04-27 13:51:42.626 CDT [43526] LOG: duration: 0.936 ms statement: insert into tab1 values (1);

-- this reports the old query_id
2024-04-27 13:51:45.058 CDT [43483] LOG: query_id = -192969736922694368

2024-04-27 13:51:45.059 CDT [43483] LOG: duration: 0.913 ms bind S_2: select from tab1 where id = $1
2024-04-27 13:51:45.059 CDT [43483] LOG: query_id = 3010297048333693297
2024-04-27 13:51:45.059 CDT [43483] LOG: duration: 0.096 ms execute S_2: select from tab1 where id = $1
2024-04-27 13:51:46.777 CDT [43483] LOG: query_id = 3010297048333693297
2024-04-27 13:51:46.777 CDT [43483] LOG: duration: 0.108 ms bind S_2: select from tab1 where id = $1
2024-04-27 13:51:46.777 CDT [43483] LOG: query_id = 3010297048333693297
2024-04-27 13:51:46.777 CDT [43483] LOG: duration: 0.024 ms execute S_2: select from tab1 where id = $1

The easy answer is to not report the queryId during the bind message, but I will
look at what else can be done here, since it is good to have a queryId reported
in this scenario: when planning times are long, we would rather not miss those
queries in pg_stat_activity sampling.
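
For what it's worth, the sampling in question just means periodically reading
pg_stat_activity.query_id (available since v14) for the backend of interest; a
minimal illustration in the same JDBC style, with the connection details, class
name and polling interval again being assumptions:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class SampleQueryId {
    public static void main(String[] args) throws Exception {
        int pid = Integer.parseInt(args[0]); // backend PID to watch
        String url = "jdbc:postgresql://localhost:5432/postgres";
        try (Connection conn = DriverManager.getConnection(url, "postgres", "");
             PreparedStatement ps = conn.prepareStatement(
                 "SELECT query_id, state FROM pg_stat_activity WHERE pid = ?")) {
            ps.setInt(1, pid);
            // Sample once a second; if query_id is only reported at execute
            // time, a long planning phase shows up here as NULL (read as 0).
            while (true) {
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) {
                        System.out.printf("query_id=%d state=%s%n",
                                          rs.getLong(1), rs.getString(2));
                    }
                }
                Thread.sleep(1000);
            }
        }
    }
}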

[1] https://github.com/postgres/postgres/blob/master/src/backend/tcop/postgres.c#L4877

Regards,

Sami

Attachment Content-Type Size
Test.java application/octet-stream 813 bytes
