Re: Bug in pg_stat_statements

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Bug in pg_stat_statements
Date: 2025-10-24 21:04:20
Message-ID: CAA5RZ0uU2p_BXevGL17N8bva5Tvke+3xTcdpYbVegyb+kDfj_Q@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> > > > Having said
> > > > that it seems that another solution would be to check for duplicated constants
> > > > in fill_in_constant_length
> > >
> > > Yes, I started thinking along these lines as well, where we check for
> > > duplicates
> > > in fill_in_constant_length; or after jumbling, we de-duplicate locations if we
> > > have a squashable list, which is what I have attached with updated test cases.
> > >
> > > This means we do need to scan the locations one extra time during jumbling,
> > > but I don't see that as a problem. Maybe there is another better way?
> >
> > Why? What I hand in mind is something like this, after a quick test it seems to
> > be able to address both the original case and the one you've discovered.
>
> ahh, what you have works because clocations is sorted by location before
> the loop starts in `fill_in_constant_lengths` , so comparing locations
> makes sense in this case. I am OK with this.

v3-0001 does what Dimiti suggests with some slight tweaks:

1/ Moving "last_loc = loc;" earlier in the loop
2/ removing a comment for fill_in_constant_lengths that was
an oversight from 0f65f3eec, as we no longer assume that
the constant location is -1 when entering that routine. Also
added the test cases that were discovered, besides the
one from the original report.

> I was really hoping that the fix could be inside of query jumbling, as I think
> pg_stat_statements or any consumers of query jumbling don't need to
> care more about squashing ( or other internals of jumbling ).
> So now I also wonder if we should also move:
>
> ```
> static char *generate_normalized_query(JumbleState *jstate, const char *query,
> int query_loc, int *query_len_p);
>
> static void fill_in_constant_lengths(JumbleState *jstate, const char *query,
> int query_loc);
> ```
>
> from pg_stat_statements.c to queryjumblefuncs.c?
>
>

v3-0002 moved both generate_normalized_query and fill_in_constant_lengths
to queryjumblefuncs.c, as these routines deal with the internal of jumbling
and should not be a concern of pg_stat_statements. I think this is a good
cleanup, but others may have different opinions on this.

--
Sami Imseih
Amazon Web Services (AWS)

Attachment Content-Type Size
v3-0001-pg_stat_statements-Fix-duplicate-constant-locatio.patch application/octet-stream 8.5 KB
v3-0002-pg_stat_statements-move-out-jumble-specific-routi.patch application/octet-stream 21.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2025-10-24 21:13:53 Re: another autovacuum scheduling thread
Previous Message Peter Geoghegan 2025-10-24 20:56:13 Re: use SIMD in GetPrivateRefCountEntry()