[BUG] pg_stat_statements and extended query protocol

From: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: [BUG] pg_stat_statements and extended query protocol
Date: 2023-01-25 23:22:04
Message-ID: EBE6C507-9EB6-4142-9E4D-38B1673363A7@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Doing some work with extended query protocol, I encountered the same
issue that was discussed in [1]. It appears when a client is using
extended query protocol and sends an Execute message to a portal with
max_rows, and a portal is executed multiple times,
pg_stat_statements does not correctly track rows and calls.

Using the attached jdbc script, TEST.java, which can reproduce the issue
with setFetchSize of 100 with autocommit mode set to OFF. We can
see that although pg_class has 414 rows, the total call and
rows returned is 14. the first 4 * 100 fetches did not get accounted for,.

postgres=# select calls, rows, query from pg_stat_statements
postgres-# where queryid = '-1905758228217333571';
calls | rows | query
---------------------------------
1 | 14 | select * from pg_class
(1 row)

The execution work flow goes something like this:
ExecutorStart
ExecutorRun – which will be called multiple times to fetch from the
portal until the caller Closes the portal or the portal
runs out of rows.
ExecutorFinish
ExecutorEnd – portal is closed & pg_stat_statements stores the final rows processed

Where this breaks for pg_stat_statements is during ExecutorRun,
es_processed is reset to 0 every iteration. So by the time the portal
is closed, es_processed will only show the total from the last execute
message.

This appears to be only an issue for portals fetched
through extended query protocol and not explicit cursors
that go through simple query protocol (i.e. FETCH <cursor>)

I attached a JDBC script to repro the issue.

One potential fix I see is to introduce 2 new counters in the
ExecutionState which will track the total rows processed
and the number of calls. These counters can then be used
by pg_stat_statements. Attached is an experimental patch
which shows the correct number of rows and number of
calls.

postgres=# select calls, rows, query from pg_stat_statements
postgres-# where queryid = '-1905758228217333571';
calls | rows | query
---------------------------------
5 | 414 | select * from pg_class
(1 row)

[1] https://www.postgresql.org/message-id/flat/c90890e7-9c89-c34f-d3c5-d5c763a34bd8%40dunslane.net

Thanks


Sami Imseih
Amazon Web Services (AWS)

Attachment Content-Type Size
0001-correct-pg_stat_statements-tracking-of-portals.patch application/octet-stream 5.0 KB
Test.java application/octet-stream 811 bytes

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2023-01-25 23:28:43 Re: [PATCH] Clarify the behavior of the system when approaching XID wraparound (stop telling users to "vacuum that database in single-user mode")
Previous Message Jacob Champion 2023-01-25 23:22:02 Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security