Re: pg_stat_statements and "IN" conditions

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zhihong Yu <zyu(at)yugabyte(dot)com>, David Steele <david(at)pgmasters(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, Pavel Trukhanov <pavel(dot)trukhanov(at)gmail(dot)com>
Subject: Re: pg_stat_statements and "IN" conditions
Date: 2022-03-14 14:17:57
Message-ID: CA+TgmoYG+nL1-3xmejYx-dm__7JMpCaptYkCVernS7wrriHQ+w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 12, 2022 at 9:11 AM Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:
> Here is the limited version of list collapsing functionality, which
> doesn't utilize eval_const_expressions and ignores most of the stuff
> except ArrayExprs. Any thoughts/more suggestions?

The proposed commit message says this commit intends to "Make Consts
contribute nothing to the jumble hash if they're part of a series and
at position further that specified threshold." I'm not sure whether
that's what the patch actually implements because I can't immediately
understand the new logic you've added, but I think if we did what that
sentence said then, supposing the threshold is set to 1, it would
result in producing the same hash for "x in (1,2)" that we do for "x
in (1,3)" but a different hash for "x in (2,3)" which does not sound
like what we want. What I would have thought we'd do is: if the list
is all constants and long enough to satisfy the threshold then nothing
in the list gets jumbled.

I'm a little surprised that there's not more context-awareness in this
code. It seems that it applies to every ArrayExpr found in the query,
which I think would extend to cases beyond something = IN(whatever).
In particular, any use of ARRAY[] in the query would be impacted. Now,
the comments seem to imply that's pretty intentional, but from the
user's point of view, WHERE x in (1,3) and x = any(array[1,3]) are two
different things. If anything like this is to be adopted, we certainly
need to be precise about exactly what it is doing and which cases are
covered. I thought of looking at the documentation to see whether
you'd tried to clarify this there, and found that you hadn't written
any.

In short, I think this patch is not really very close to being in
committable shape even if nobody were objecting to the concept.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2022-03-14 14:36:07 Re: support for MERGE
Previous Message Maxim Orlov 2022-03-14 14:16:32 Re: Add 64-bit XIDs into PostgreSQL 15