Re: Bug in pg_stat_statements

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Konstantin Knizhnik <knizhnik(at)garret(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Bug in pg_stat_statements
Date: 2025-10-24 00:49:55
Message-ID: CAA5RZ0tn08nf5o3gt0jSEoASv7odTseDvuvnUWzxx5mRmu0GuQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> In PG18 gthere is bug either in pg_stat_statements.c, either in
> queryjumblefuncs.c.
>
> Repro (you need Postgres build with asserts and pg_stat_statements
> included in shared_preload_librarties:
>
> create function f(a integer[], out x integer, out y integer) returns
> record as $$ begin
> x = a[1];
> y = a[2];
> end;
> $$ language plpgsql;
>
> create table t(x integer);
> select ((f(array[1,2]))).* from t;
>
> The problems is caused by ((...)).* which cause to include the same
> array literal twice in JumbleQuery.
> But pg_stat_stataement assumes that any occurrence of constant is unique
> and it cause assertion failure here:
>
> /* Copy next chunk (what precedes the next constant) */
> len_to_wrt = off - last_off;
> len_to_wrt -= last_tok_len;
> Assert(len_to_wrt >= 0);
>
> So we should either exclude duplicates in RecordConstLocation
> (queryjumblefuncs.c)

You are correct in the analysis that this is because we are
recording the same constant multiple times, this causes the
len_to_write to be 0 on the second occurance of the duplicated
location. I confirmed that this was introduced in the constant list
squashing per 0f65f3eec.

The same location being recorded twice will occur because of
row expansion, therefore the same location will be jumbled twice,
as observed in the explain output.
```
postgres=# explain verbose select ((f(array[1,2]))).* from t;
QUERY PLAN
----------------------------------------------------------------
Seq Scan on public.t (cost=0.00..1310.50 rows=2550 width=8)
Output: (f('{1,2}'::integer[])).x, (f('{1,2}'::integer[])).y
Query Identifier: 5305868219688930530
(3 rows)

```

Before 0f65f3eec, this was not an issue, because we recorded
all the locations.

I believe the correct answer is to fix this in queryjumblefuncs.c
and more specifically inside _jumbleElements. We should not
record the same location twice. We can do this by tracking the
start location of the last recorded constant location, and we
can skip recording it if we see it again. See the attached
which also includes test cases.

Also, inside the loop of generate_normalized_query in
pg_stat_statements we are asserting len_to_wrt >= 0, so we
are OK with memcpy of 0 bytes. This is valid, but does
not seem correct.

```
Assert(len_to_wrt > 0);
memcpy(norm_query + n_quer_loc, query + quer_loc, len_to_wrt);
```

Should we continue the loop and skip the memcpy
if (len_to_wrt == 0) instead?

--

Sami Imseih
Amazon Web Services (AWS)

Attachment Content-Type Size
0001-Fix-jumbling-of-squashed-lists-with-row-expansion.patch application/octet-stream 6.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ranier Vilela 2025-10-24 00:51:14 Avoid handle leak (src/bin/pg_ctl/pg_ctl.c)
Previous Message Ranier Vilela 2025-10-24 00:37:21 Avoid resource leak (src/test/regress/pg_regress.c)