Re: Refactor query normalization into core query jumbling

From: Lukas Fittl <lukas(at)fittl(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Sami Imseih <samimseih(at)gmail(dot)com>, zengman <zengman(at)halodbtech(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>
Subject: Re: Refactor query normalization into core query jumbling
Date: 2026-03-30 05:21:46
Message-ID: CAP53Pkwb2f288Ji2s9ZZTL0_PEyYCZpQjBtD63X_s+yUB3YORA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 27, 2026 at 6:42 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> > On Mar 28, 2026, at 2:09, Sami Imseih <samimseih(at)gmail(dot)com> wrote:
> > I agree that ComputeConstantLengths should be in core. This one is
> > not a gray area IMO. The query jumble already records constant locations,
> > but leaves the lengths unset. ComputeConstantLengths is just the
> > completion of that work. There could be no other interpretation,
> > unlike generate_normalized_query, of what the lengths should be.
>
> This is an interesting remark. One problem with any patches presented yet is that we don’t attach the lengths as part of the in-core query jumbling procedure: we plug the lengths later using the same structure as the query jumbling. It seems to me that this is half-baked overall. I think that we don’t want to pay the extra length computation in the core query jumbling at the end, then why does it make sense to include the lengths in the JumbleState structure at all? Shouldn’t we use a different structure filled inside PGSS for this purpose rather than reuse the same thing for PGSS and the in-core part (don’t have the code in front of me now).

I see where you're coming from on that, but I don't think we can
remove anything here in practice:

typedef struct LocationLen
{
int location; /* Required */
int length; /* Set by _jumbleElements */
bool squashed; /* Set by RecordConstLocation called from
_jumbleElements */
bool extern_param; /* Set by RecordConstLocation called
from _jumbleParam */
} LocationLen;

typedef struct JumbleState
{
unsigned char *jumble; /* Required */
Size jumble_len; /* Required */
LocationLen *clocations; /* Required to track constant locations */
int clocations_buf_size; /* Required to track constant
locations */
int clocations_count; /* Required to track constant locations */
int highest_extern_param_id; /* Set by _jumbleParam */
bool has_squashed_lists; /* Set by _jumbleElements */
unsigned int pending_nulls; /* Required */
Size total_jumble_len; /* Required */
} JumbleState;

The only refactoring I could think of is to write out the
_jumbleElements information separately, then we could actually drop
length, and maybe squashed, from LocationLen. But I'm not sure that
really buys us much? It would be more clear I guess, because the mixed
locations of where length gets set is indeed a bit odd.

I still think it'd be reasonable for us to include
ComputeConstantLengths in core to complete the picture of what we're
doing with _jumbleElements and the length field already anyway. Its
basically a way to fully hydrate the partially filled out JumbleState
from the initial jumble.

Thanks,
Lukas

--
Lukas Fittl

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2026-03-30 05:40:49 Re: [PATCH] Fix unexpected loss of DEFERRABLE property after toggling NOT ENFORCED / ENFORCED
Previous Message Amit Langote 2026-03-30 04:55:07 Re: Eliminating SPI / SQL from some RI triggers - take 3