Cleaning up PREPARE query strings?

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Cleaning up PREPARE query strings?
Date: 2025-12-24 15:43:13
Message-ID: aUwKEWGge5jWtaRX@jrouhaud
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Currently prepared statements store the whole query string that was submitted
by the client at the time of the PREPARE as-is. This is usually fine, but if
that query was a multi-statement query string it can lead to a waste of memory.

There are some pattern that are more likely to have such overhead, mine being
an application with a fixed set of prepared statements that are sent at the
connection start using a single query to avoid extra round trips.

One naive example of the outcome is as follow:

#= PREPARE s1 AS SELECT 1\; PREPARE s2(text) AS SELECT oid FROM pg_class WHERE
relname = $1\; PREPARE s3(int, int) AS SELECT $1 + $2;
PREPARE
PREPARE
PREPARE

=# SELECT name, statement FROM pg_prepared_statements ;
name | statement
------+----------------------------------------------------------------------------
s1 | PREPARE s1 AS SELECT 1; PREPARE s2(text) AS SELECT oid FROM pg_class WHERE+
| relname = $1; PREPARE s3(int, int) AS SELECT $1 + $2;
s2 | PREPARE s1 AS SELECT 1; PREPARE s2(text) AS SELECT oid FROM pg_class WHERE+
| relname = $1; PREPARE s3(int, int) AS SELECT $1 + $2;
s3 | PREPARE s1 AS SELECT 1; PREPARE s2(text) AS SELECT oid FROM pg_class WHERE+
| relname = $1; PREPARE s3(int, int) AS SELECT $1 + $2;
(3 rows)

The more prepared statements you have the bigger the waste. This is also not
particularly readable for people who want to rely on the pg_prepared_statements
views, as you need to parse the query again yourself to figure out what exactly
is the associated query.

I assume that some other patterns could lead to other kind of problems. For
instance if the query string includes a prepared statement and some DML, it
could lead some automated program to replay both the PREPARE and DML when only
the PREPARE was intended.

I'm attaching a POC patch to fix that behavior by teaching PREPARE to clean the
passed query text the same way as pg_stat_statements. Since it relies on the
location saved during parsing the overhead should be minimal, and only present
when some space can actually be saved. Note that I first tried to have the
cleanup done in CreateCachedPlan so that it's done everywhere including things
like the extended protocol but this lead to too many issues so I ended up doing
it for an explicit PREPARE statement only.

With this patch applied, the above scenario gives this new output:

=# SELECT name, statement FROM pg_prepared_statements ;
name | statement
------+----------------------------------------------------
s1 | PREPARE s1 AS SELECT 1
s2 | PREPARE s2(text) AS SELECT oid FROM pg_class WHERE+
| relname = $1
s3 | PREPARE s3(int, int) AS SELECT $1 + $2
(3 rows)

One possible issue is that any comment present at the beginning of the query
text would be discarded. I'm not sure if that's something used by e.g.
pg_hint_plan, but if yes it's always possible to put the statement in front of
the SELECT (or other actual first keyword) rather than the PREPARE itself to
preserve it.

Attachment Content-Type Size
v1-0001-Cleanup-explicit-PREPARE-query-strings.patch text/plain 11.0 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2025-12-24 16:21:00 Re: Cleaning up PREPARE query strings?
Previous Message Fujii Masao 2025-12-24 15:35:18 Re: Fix wrong reference in pg_overexplain's doc