| From: | Julien Rouhaud <rjuju123(at)gmail(dot)com> |
|---|---|
| To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Cleaning up PREPARE query strings? |
| Date: | 2025-12-24 15:43:13 |
| Message-ID: | aUwKEWGge5jWtaRX@jrouhaud |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
Currently prepared statements store the whole query string that was submitted
by the client at the time of the PREPARE as-is. This is usually fine, but if
that query was a multi-statement query string it can lead to a waste of memory.
There are some pattern that are more likely to have such overhead, mine being
an application with a fixed set of prepared statements that are sent at the
connection start using a single query to avoid extra round trips.
One naive example of the outcome is as follow:
#= PREPARE s1 AS SELECT 1\; PREPARE s2(text) AS SELECT oid FROM pg_class WHERE
relname = $1\; PREPARE s3(int, int) AS SELECT $1 + $2;
PREPARE
PREPARE
PREPARE
=# SELECT name, statement FROM pg_prepared_statements ;
name | statement
------+----------------------------------------------------------------------------
s1 | PREPARE s1 AS SELECT 1; PREPARE s2(text) AS SELECT oid FROM pg_class WHERE+
| relname = $1; PREPARE s3(int, int) AS SELECT $1 + $2;
s2 | PREPARE s1 AS SELECT 1; PREPARE s2(text) AS SELECT oid FROM pg_class WHERE+
| relname = $1; PREPARE s3(int, int) AS SELECT $1 + $2;
s3 | PREPARE s1 AS SELECT 1; PREPARE s2(text) AS SELECT oid FROM pg_class WHERE+
| relname = $1; PREPARE s3(int, int) AS SELECT $1 + $2;
(3 rows)
The more prepared statements you have the bigger the waste. This is also not
particularly readable for people who want to rely on the pg_prepared_statements
views, as you need to parse the query again yourself to figure out what exactly
is the associated query.
I assume that some other patterns could lead to other kind of problems. For
instance if the query string includes a prepared statement and some DML, it
could lead some automated program to replay both the PREPARE and DML when only
the PREPARE was intended.
I'm attaching a POC patch to fix that behavior by teaching PREPARE to clean the
passed query text the same way as pg_stat_statements. Since it relies on the
location saved during parsing the overhead should be minimal, and only present
when some space can actually be saved. Note that I first tried to have the
cleanup done in CreateCachedPlan so that it's done everywhere including things
like the extended protocol but this lead to too many issues so I ended up doing
it for an explicit PREPARE statement only.
With this patch applied, the above scenario gives this new output:
=# SELECT name, statement FROM pg_prepared_statements ;
name | statement
------+----------------------------------------------------
s1 | PREPARE s1 AS SELECT 1
s2 | PREPARE s2(text) AS SELECT oid FROM pg_class WHERE+
| relname = $1
s3 | PREPARE s3(int, int) AS SELECT $1 + $2
(3 rows)
One possible issue is that any comment present at the beginning of the query
text would be discarded. I'm not sure if that's something used by e.g.
pg_hint_plan, but if yes it's always possible to put the statement in front of
the SELECT (or other actual first keyword) rather than the PREPARE itself to
preserve it.
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Cleanup-explicit-PREPARE-query-strings.patch | text/plain | 11.0 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tom Lane | 2025-12-24 16:21:00 | Re: Cleaning up PREPARE query strings? |
| Previous Message | Fujii Masao | 2025-12-24 15:35:18 | Re: Fix wrong reference in pg_overexplain's doc |