| From: | Sami Imseih <samimseih(at)gmail(dot)com> |
|---|---|
| To: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com> |
| Cc: | Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Bug in pg_stat_statements |
| Date: | 2025-10-24 21:04:20 |
| Message-ID: | CAA5RZ0uU2p_BXevGL17N8bva5Tvke+3xTcdpYbVegyb+kDfj_Q@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
> > > > Having said
> > > > that it seems that another solution would be to check for duplicated constants
> > > > in fill_in_constant_length
> > >
> > > Yes, I started thinking along these lines as well, where we check for
> > > duplicates
> > > in fill_in_constant_length; or after jumbling, we de-duplicate locations if we
> > > have a squashable list, which is what I have attached with updated test cases.
> > >
> > > This means we do need to scan the locations one extra time during jumbling,
> > > but I don't see that as a problem. Maybe there is another better way?
> >
> > Why? What I hand in mind is something like this, after a quick test it seems to
> > be able to address both the original case and the one you've discovered.
>
> ahh, what you have works because clocations is sorted by location before
> the loop starts in `fill_in_constant_lengths` , so comparing locations
> makes sense in this case. I am OK with this.
v3-0001 does what Dimiti suggests with some slight tweaks:
1/ Moving "last_loc = loc;" earlier in the loop
2/ removing a comment for fill_in_constant_lengths that was
an oversight from 0f65f3eec, as we no longer assume that
the constant location is -1 when entering that routine. Also
added the test cases that were discovered, besides the
one from the original report.
> I was really hoping that the fix could be inside of query jumbling, as I think
> pg_stat_statements or any consumers of query jumbling don't need to
> care more about squashing ( or other internals of jumbling ).
> So now I also wonder if we should also move:
>
> ```
> static char *generate_normalized_query(JumbleState *jstate, const char *query,
> int query_loc, int *query_len_p);
>
> static void fill_in_constant_lengths(JumbleState *jstate, const char *query,
> int query_loc);
> ```
>
> from pg_stat_statements.c to queryjumblefuncs.c?
>
>
v3-0002 moved both generate_normalized_query and fill_in_constant_lengths
to queryjumblefuncs.c, as these routines deal with the internal of jumbling
and should not be a concern of pg_stat_statements. I think this is a good
cleanup, but others may have different opinions on this.
--
Sami Imseih
Amazon Web Services (AWS)
| Attachment | Content-Type | Size |
|---|---|---|
| v3-0001-pg_stat_statements-Fix-duplicate-constant-locatio.patch | application/octet-stream | 8.5 KB |
| v3-0002-pg_stat_statements-move-out-jumble-specific-routi.patch | application/octet-stream | 21.1 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Peter Geoghegan | 2025-10-24 21:13:53 | Re: another autovacuum scheduling thread |
| Previous Message | Peter Geoghegan | 2025-10-24 20:56:13 | Re: use SIMD in GetPrivateRefCountEntry() |