Re: Refactor query normalization into core query jumbling

From: Lukas Fittl <lukas(at)fittl(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, zengman <zengman(at)halodbtech(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>
Subject: Re: Refactor query normalization into core query jumbling
Date: 2026-03-26 01:34:02
Message-ID: CAP53PkzCzeDpg6dCzVrU17LCiCmRSWfEt-5dr4y+VHnTxKr_9w@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Sami,

On Mon, Mar 16, 2026 at 7:12 PM Sami Imseih <samimseih(at)gmail(dot)com> wrote:
>
> > Here is v4.
> >
> > 0001 - addresses the comments made by Bertrand.
> > 0002 - makes JumbleState a constant when passed to post_parse_analyze
> > and updates all downstream code that consume the JumbleState. This
> > means we now need to copy the locations from JState into a local/temporary
> > array when generating the normalized string.

In 0001:

> diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
> > index 87db8dc1a32..d4b26202c47 100644
> --- a/src/backend/nodes/queryjumblefuncs.c
> +++ b/src/backend/nodes/queryjumblefuncs.c
> ...
> @@ -773,3 +775,249 @@ _jumbleRangeTblEntry_eref(JumbleState *jstate,
> +
> + /* we don't want to re-emit any escape string warnings */
> + yyextra.escape_string_warning = false;
> +

I don't think this is needed anymore, as of
45762084545ec14dbbe66ace1d69d7e89f8978ac.

> +/*
> + * Callback to generate a normalized version of the query string that will be used to
> + * represent all similar queries.
> + *

I don't think the term "Callback" makes sense here - I think you could
just keep the original wording.

In 0002:

> diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
> index d4b26202c47..fe8f0ccd278 100644
> --- a/src/backend/nodes/queryjumblefuncs.c
> +++ b/src/backend/nodes/queryjumblefuncs.c
> ...
> @@ -813,14 +815,20 @@ SetConstantLengths(JumbleState *jstate, const char *query,
> core_YYSTYPE yylval;
> YYLTYPE yylloc;
>
> + if (jstate->clocations_count == 0)
> + return NULL;
> +
> + /* Copy constant locations to avoid modifying jstate */
> + locs = palloc(jstate->clocations_count * sizeof(LocationLen));
> + memcpy(locs, jstate->clocations, jstate->clocations_count * sizeof(LocationLen));
> +

You could use palloc_array for locs here.

> @@ -938,12 +948,13 @@ GenerateNormalizedQuery(JumbleState *jstate, const char *query,
> last_off = 0, /* Offset from start for previous tok */
> last_tok_len = 0; /* Length (in bytes) of that tok */
> int num_constants_replaced = 0;
> + LocationLen *locs = NULL;
>
> /*
> * Set constants' lengths in JumbleState, as only locations are set during
> * DoJumble(). Note this also ensures the items are sorted by location.
> */
> - SetConstantLengths(jstate, query, query_loc);
> + locs = SetConstantLengths(jstate, query, query_loc);

I think we should update the comment here to reflect the fact that
we're no longer modifying JumbleState.

Otherwise these patches look good - it'd be nice to still get this
into 19 so we have less code duplication across the different
extensions working with normalized query text.

Thanks,
Lukas

--
Lukas Fittl

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2026-03-26 02:22:25 Re: Fixes inconsistent behavior in vacuum when it processes multiple relations
Previous Message Michael Paquier 2026-03-26 01:31:19 Re: Track skipped tables during autovacuum and autoanalyze